A Python library for actuarial loss modeling using frequency–severity methods.
lossmodels is a readable, NumPy/SciPy-backed implementation of the core
loss-modeling toolkit: a large catalog of severity and frequency distributions,
aggregate (collective-risk) loss models, coverage modifications, and fitting and
model-selection utilities. It is aimed at actuarial students, analysts, insurance
data scientists, and quantitative developers who want a lightweight library whose
parameterizations line up with the standard references.
The continuous-severity and discrete-frequency inventories use the standard scale-parameterized forms common in actuarial loss modeling, so means, moments, and quantiles line up with the formulas you would compute by hand.
- Severity models — a comprehensive continuous inventory: lognormal, gamma, Weibull, exponential and their inverses; the transformed-beta family (Burr, inverse Burr, generalized Pareto, Pareto/Lomax, inverse Pareto, loglogistic, paralogistic, inverse paralogistic); inverse Gaussian, log-t, single-parameter Pareto; and finite-support beta / generalized beta.
- Frequency models — the
(a, b, 0)class (Poisson, binomial, geometric, negative binomial) and the(a, b, 1)class (zero-truncated, zero-modified, logarithmic). - Spliced / composite severities — a body distribution below a threshold and a heavier tail above it, behind a single severity interface.
- Empirical distributions for raw frequency and severity data.
- Aggregate loss modeling — Monte Carlo simulation, Panjer recursion, and FFT aggregation, plus distribution-level and PMF-level risk measures (VaR, TVaR, stop-loss, limited expected value).
- Coverage modifications — ordinary deductibles, policy limits, and layers, each wrapping any severity and re-usable everywhere a severity is accepted.
- Estimation — maximum likelihood and method-of-moments fitters, log-likelihood / AIC / BIC diagnostics, goodness-of-fit statistics (KS, Anderson–Darling, Cramér–von Mises), and automatic best-fit selection.
From PyPI:
pip install lossmodelsIn development mode from a source checkout:
pip install -e .Requires Python >=3.10; the only runtime dependencies are numpy and scipy.
lossmodels/
├── severity/ # continuous claim-severity distributions
├── frequency/ # discrete claim-count distributions
├── aggregate/ # collective-risk models, discretization, Panjer, FFT, risk metrics
├── coverage/ # deductibles, limits, and layers
├── empirical/ # empirical frequency and severity distributions
├── estimation/ # MLE, method of moments, diagnostics, model selection
└── utils/ # shared numeric / validation helpers
from lossmodels import Poisson, Lognormal, CollectiveRiskModel
freq = Poisson(lam=2.0)
sev = Lognormal(mu=10.0, sigma=0.8)
model = CollectiveRiskModel(freq, sev)
print("Mean:", model.mean())
print("Variance:", model.variance())
print("VaR 95%:", model.var(0.95))
print("TVaR 95%:", model.tvar(0.95))
samples = model.sample(50_000)
print("Simulated mean:", samples.mean())Every model name is importable directly from the top level
(from lossmodels import Burr, ZeroTruncated, ...) or from its submodule
(from lossmodels.severity import Burr).
from lossmodels import Burr, ParetoII, ZeroTruncated, Poisson, CollectiveRiskModel
sev = ParetoII(alpha=3.0, theta=1000.0) # the two-parameter "Pareto" (Type II / Lomax)
freq = ZeroTruncated(Poisson(2.0)) # an (a, b, 1) frequency
model = CollectiveRiskModel(freq, sev)
print(model.mean(), model.var(0.99))
burr = Burr(alpha=3.0, theta=1000.0, gamma=2.0)
print(burr.mean(), burr.cdf(2500.0), burr.quantile(0.95))
print(burr.limited_expected_value(1000.0)) # E[min(X, 1000)]Every severity exposes the same interface, so they are interchangeable across the aggregate, coverage, and estimation tooling:
| Method | Meaning |
|---|---|
pdf(x), cdf(x) |
density and distribution function (scalar or array) |
quantile(p) / ppf(p) |
inverse CDF / Value-at-Risk |
sample(size) |
random variates |
mean(), variance(), std() |
moments (raise where they do not exist) |
limited_expected_value(d) |
E[min(X, d)] |
excess_loss(d) |
E[(X − d)₊], the expected cost per loss above d |
Moments are returned from closed-form expressions and raise a
ValueError outside their existence range (e.g. a Pareto mean with alpha ≤ 1),
rather than silently returning a wrong or infinite number. A few distributions
(inverse Pareto, inverse exponential, log-t) have no finite positive moments, so
their mean/variance always raise.
| Distribution | Constructor | Support |
|---|---|---|
| Exponential | Exponential(rate) |
x > 0 |
| Gamma | Gamma(alpha, theta) |
x > 0 |
| Lognormal | Lognormal(mu, sigma) |
x > 0 |
| Weibull | Weibull(k, lam) |
x > 0 |
| Pareto (Type I) | Pareto(alpha, theta) |
x ≥ theta |
| Distribution | Constructor |
|---|---|
| Burr (Type XII / Singh–Maddala) | Burr(alpha, theta, gamma) |
| Inverse Burr (Dagum) | InverseBurr(tau, theta, gamma) |
| Generalized Pareto (transformed-beta) | GeneralizedPareto(alpha, theta, tau) |
| Pareto (Type II / Lomax) | ParetoII(alpha, theta) |
| Inverse Pareto | InversePareto(tau, theta) |
| Loglogistic (Fisk) | Loglogistic(gamma, theta) |
| Paralogistic | Paralogistic(alpha, theta) |
| Inverse paralogistic | InverseParalogistic(tau, theta) |
| Distribution | Constructor |
|---|---|
| Inverse gamma (Vinci) | InverseGamma(alpha, theta) |
| Inverse Weibull (log-Gompertz) | InverseWeibull(theta, tau) |
| Inverse exponential | InverseExponential(theta) |
| Distribution | Constructor |
|---|---|
| Inverse Gaussian (Wald) | InverseGaussian(mu, theta) |
| Log-t | LogT(r, mu, sigma) |
| Single-parameter Pareto (Type I) | SingleParameterPareto(alpha, theta) |
| Distribution | Constructor | Support |
|---|---|---|
| Beta | Beta(a, b, theta) |
0 < x < theta |
| Generalized beta | GeneralizedBeta(a, b, theta, tau) |
0 < x < theta |
A few naming points matter:
- Two Paretos.
ParetoII(alpha, theta)is the two-parameter "Pareto" (Type II / Lomax, supportx > 0). The classicPareto(alpha, theta)is the Type I distribution (supportx ≥ theta), commonly called the single-parameter Pareto; it is also exposed under that name asSingleParameterPareto. GeneralizedParetohere is the three-parameter transformed-beta distribution, not the extreme-value GPD used in peaks-over-threshold tail fitting. The extreme-value distributions (GEV, Gumbel, Fréchet, GPD, Hill) live in the companionextremelosspackage.Exponentialis parameterized byrate(=1/theta) andWeibullby shapekand scalelam; the transformed-beta and transformed-gamma families follow the scale-parameter (theta) convention.
Every frequency model exposes pmf(k), cdf(k), sample(size), mean(),
variance(), and std().
| Distribution | Constructor |
|---|---|
| Poisson | Poisson(lam) |
| Binomial | Binomial(n, p) |
| Geometric | Geometric(p) |
| Negative binomial | NegativeBinomial(r, p) |
The (a, b, 1) distributions are built from a base (a, b, 0) model:
from lossmodels import Poisson, NegativeBinomial, ZeroTruncated, ZeroModified, Logarithmic
zt = ZeroTruncated(Poisson(2.0)) # zero is impossible
zm = ZeroModified(NegativeBinomial(3.0, 1/3), p0_modified=0.4) # custom mass at zero
lg = Logarithmic(beta=3.0) # no (a, b, 0) parent
print(zt.mean(), zt.pmf(0)) # 0.0 mass at zero
print(zm.mean(), zm.pmf(0)) # 0.4 at zero
print(lg.mean())| Distribution | Constructor | Notes |
|---|---|---|
| Zero-truncated | ZeroTruncated(base) |
removes the mass at zero of any (a, b, 0) model and renormalizes |
| Zero-modified | ZeroModified(base, p0_modified) |
places an arbitrary probability p0_modified ∈ [0, 1) at zero |
| Logarithmic | Logarithmic(beta) |
the limiting member with no (a, b, 0) parent |
The wrappers respect the base model's probability parameterization; only
Logarithmic uses beta. For a beta-parameterized geometric or negative
binomial, set p = 1 / (1 + beta).
SplicedSeverity joins a body distribution (renormalized below the threshold) to a
tail distribution supported above it. The tail must satisfy cdf(threshold) = 0
(e.g. a Type I Pareto with theta equal to the threshold), and the mixing
weight is the probability mass assigned to the body.
from lossmodels import Lognormal, Pareto, SplicedSeverity
u = 50_000
body = Lognormal(mu=10.0, sigma=0.8)
tail = Pareto(alpha=2.5, theta=u) # Type I tail, cdf(u) = 0
spliced = SplicedSeverity(body=body, tail=tail, threshold=u, weight=body.cdf(u))
print(spliced.mean(), spliced.quantile(0.99))The result is itself a severity (pdf, cdf, quantile, sample, mean,
variance), so a tail-corrected severity drops straight back into the aggregate
and coverage tooling.
from lossmodels import EmpiricalSeverity, EmpiricalFrequency
sev = EmpiricalSeverity(claim_amounts)
freq = EmpiricalFrequency(claim_counts)Both expose the standard severity / frequency interface built from the raw data.
Deductibles, limits, and layers each wrap a severity and are themselves severities, so they compose and feed any aggregate model.
from lossmodels import Lognormal, Poisson, CollectiveRiskModel
from lossmodels.coverage import OrdinaryDeductible, PolicyLimit, Layer
base_sev = Lognormal(mu=10.0, sigma=0.8)
with_deductible = OrdinaryDeductible(base_sev, d=10_000)
with_limit = PolicyLimit(base_sev, u=50_000)
layer = Layer(base_sev, d=10_000, u=40_000) # pays the loss in the layer above d up to u
model = CollectiveRiskModel(Poisson(lam=2.0), layer)
print(model.mean())| Modification | Constructor |
|---|---|
| Ordinary deductible | OrdinaryDeductible(severity, d) |
| Policy limit | PolicyLimit(severity, u) |
| Layer | Layer(severity, d, u) |
CollectiveRiskModel(frequency, severity) is the main entry point. It exposes:
| Method | Meaning |
|---|---|
mean(), variance(), std() |
aggregate moments |
var(q), tvar(q) |
Value-at-Risk and Tail-VaR at level q |
stop_loss(d) |
stop-loss premium E[(S − d)₊] |
limited_expected_value(d) |
E[min(S, d)] |
frequency_mean(), severity_mean() |
component means |
sample(size) |
simulate aggregate losses |
summary() |
a one-call overview of the model |
For exact (discretized) aggregate distributions, discretize the severity and then recurse or transform:
from lossmodels import Poisson, Lognormal
from lossmodels.aggregate import (
discretize_severity, panjer_recursion, fft_aggregate_poisson,
)
freq = Poisson(lam=3.0)
sev = Lognormal(mu=8.0, sigma=0.7)
sev_pmf = discretize_severity(sev, h=100.0, max_loss=500_000.0)
agg_pmf = panjer_recursion(freq, sev_pmf, n_steps=5000)
# FFT aggregation (Poisson frequency):
agg_pmf_fft = fft_aggregate_poisson(freq, sev_pmf, n_steps=5000)Distribution-level helpers operate on a simulated loss vector, and PMF-level helpers operate on a discretized aggregate PMF:
from lossmodels.aggregate import (
var, tvar, stop_loss, lev, # on a sample
var_from_pmf, tvar_from_pmf, stop_loss_from_pmf, # on an aggregate PMF
)
losses = model.sample(100_000)
print(var(losses, 0.95), tvar(losses, 0.95), stop_loss(losses, 50_000), lev(losses, 50_000))
print(var_from_pmf(agg_pmf, h=100.0, q=0.95))
print(tvar_from_pmf(agg_pmf, h=100.0, q=0.95))
print(stop_loss_from_pmf(agg_pmf, h=100.0, d=50_000.0))The estimation module provides MLE and method-of-moments fitters, fit
diagnostics, goodness-of-fit statistics, and automatic best-fit selection.
from lossmodels import (
fit_lognormal, fit_poisson,
fit_best_severity, fit_best_frequency,
goodness_of_fit, aic, bic,
)
sev_model = fit_lognormal(severity_data)
freq_model = fit_poisson(count_data)
best_sev = fit_best_severity(severity_data, criterion="aic")
best_freq = fit_best_frequency(count_data, criterion="bic")
print(best_sev["best_name"], best_sev["best_model"])
print(best_freq["best_name"])fit_best_severity / fit_best_frequency return a dict with best_name,
best_model, criterion, method, and the full per-candidate results.
- Severity fitters and selection candidates:
exponential,gamma,lognormal,pareto,weibull(MLE or method of moments). Also available individually:fit_exponential,fit_gamma,fit_lognormal,fit_pareto,fit_weibull, plus the genericfit_mle. - Frequency fitters and selection candidates:
poisson,negbinomial(fit_poisson,fit_negbinomial). - Diagnostics and goodness of fit:
log_likelihood,aic,bic,ks_statistic,anderson_darling,cramer_von_mises,tail_quantile_table, andgoodness_of_fit.
lossmodels is the distribution-and-aggregation core of a small set of
interoperating actuarial packages. Models built here plug into the others through
their .sample(size) and .mean() interface:
risksim— portfolio simulation and aggregate reinsurance program evaluation (layers, contract programs), consuminglossmodelsseverities and frequencies.extremeloss— extreme value theory: peaks-over-threshold / GPD tails, block maxima / GEV, the Hill estimator, and threshold diagnostics. This is the home of the extreme-value distributions, which is why they are not duplicated inlossmodels.
The repository includes runnable example scripts, including:
fit_and_simulate.py,fit_and_compare_models.pypanjer_vs_simulation.py,panjer_vs_fft_vs_simulation.pydeductible_example.py,limit_example.py,layer_example.pystop_loss_example.py
Run the test suite with:
pytest -qRun only the fast tests with:
pytest -q -m "not slow"MIT License