- Project Overview
- The UGR Distribution
- The CSPALT Setup
- Progressive Type-II Censoring
- Censoring Schemes Used
- Classical Estimation — MLE
- Bayesian Estimation
- MCMC via Metropolis-Hastings Gibbs
- Loss Functions
- Interval Estimation
- Performance Metrics
- Simulation Study
- Real Data Analysis
- Output Files Guide
- How to Run the Code
- Key Results Summary
- Repository Structure
This project is a B.Tech thesis implementing classical and Bayesian inference for the Constant Stress Partially Accelerated Life Test (CSPALT) model, where lifetimes follow the Unit Generalized Rayleigh (UGR) distribution under Progressive Type-II censoring.
Goal: Estimate three model parameters — α (shape), β (scale), λ (acceleration
factor) — using multiple competing methods and compare their performance.
Methods implemented:
| Method | Type | Estimator |
|---|---|---|
| MLE | Classical | Point estimate via L-BFGS-B optimization |
| ACI | Classical | Asymptotic Confidence Interval (Fisher information) |
| Bayes-SELF (Gamma prior) | Bayesian | Posterior mean |
| Bayes-LINEX s=+2 (Gamma prior) | Bayesian | Asymmetric loss estimate |
| Bayes-LINEX s=-2 (Gamma prior) | Bayesian | Asymmetric loss estimate |
| Bayes-SELF (Gamma-Dirichlet prior) | Bayesian | Posterior mean |
| Bayes-LINEX s=+2 (GD prior) | Bayesian | Asymmetric loss estimate |
| Bayes-LINEX s=-2 (GD prior) | Bayesian | Asymmetric loss estimate |
| BCI | Bayesian | Equal-tail credible interval |
The Unit Generalized Rayleigh (UGR) distribution is defined on the interval (0, 1), making it naturally suited for modelling lifetimes expressed as proportions or reliability scores.
A random variable X ~ UGR(α, β) has the following functions (Normal stress condition):
PDF:
f₁(x; α, β) = 2αβ²·(-ln x / x)·exp(-(β·(-ln x))²)·[1 - exp(-(β·(-ln x))²)]^(α-1)
Survival Function:
S₁(x; α, β) = [1 - exp(-(β·(-ln x))²)]^α
CDF:
F₁(x; α, β) = 1 - S₁(x; α, β)
Parameters:
α > 0— shape parameter controlling tail heavinessβ > 0— scale parameter
Inverse CDF (used for simulation):
F₁⁻¹(u) = exp(-sqrt(-ln(1 - (1-u)^(1/α))) / β)
In a Constant Stress Partially Accelerated Life Test (CSPALT), units are split into two groups:
-
Group 1 — Normal stress: Units run under standard operating conditions. Lifetimes follow UGR(α, β).
-
Group 2 — Accelerated stress: Units run under elevated stress. Lifetimes are compressed by an acceleration factor λ > 1, so they follow UGR(αλ, β).
Accelerated condition functions:
f₂(x; α, β, λ) = 2αβ²λ·(-ln x / x)·exp(-(β·(-ln x))²)·[1 - exp(-(β·(-ln x))²)]^(αλ-1)
S₂(x; α, β, λ) = [1 - exp(-(β·(-ln x))²)]^(αλ)
The constraint λ > 1 means the accelerated stress always shortens lifetime — this is
enforced throughout the estimation.
Parameters to estimate: θ = (α, β, λ)
Why censoring? In life-testing, it is impractical (or too expensive) to observe all failures. Censoring lets the experimenter terminate early while preserving statistical information.
Progressive Type-II censoring generalises ordinary Type-II censoring:
- Fix total sample size
nand number of observed failuresm < n. - Specify a censoring scheme R = (R₁, R₂, …, Rₘ) where Rᵢ is the number of surviving units removed at the i-th failure.
- Constraint:
m + ΣRᵢ = n.
At each of the m ordered failures, a predetermined number Rᵢ of the surviving units
are randomly withdrawn from the test. This allows interim removals, not just end-of-study
removals.
In this study: n = 40, m = 30 → 10 total censored units distributed according to the scheme R.
Seven distinct censoring patterns are studied (all satisfy m=30, n=40):
| Scheme | Pattern | Description |
|---|---|---|
| CS1 | R = (0,…,0, 10) | All 10 removals at the last failure — right censoring |
| CS2 | R = (10, 0,…,0) | All 10 removals at the first failure — left censoring |
| CS3 | R = (1,1,…,1,10×ones,0,…) | 1 unit removed at each of the first 10 failures |
| CS4 | R = (0,…,0, 10, 0,…) | 10 removals at the 11th failure — middle censoring |
| CS5 | R = (2,2,2,2,2, 0,…) | 2 units removed at each of the first 5 failures |
| CS6 | R = (0,…,0, 5, 0,…) | 5 removals at the 16th failure — late-middle censoring |
| CS7 | R = (0,…,0, 3, 0,…, 2, 0,…) | Split removals: 3 at step 6, 2 at step 17 |
These schemes test robustness: CS1 is the most information-rich (all data until the end), CS2 is the most aggressive early withdrawal.
Maximum Likelihood Estimation (MLE) finds θ̂ = (α̂, β̂, λ̂) that maximises the log-likelihood function:
ℓ(θ) = Σ ln f₁(x₁ᵢ; α,β) + Σ Rᵢ·ln S₁(x₁ᵢ; α,β)
+ Σ ln f₂(x₂ⱼ; α,β,λ) + Σ Rⱼ·ln S₂(x₂ⱼ; α,β,λ)
The first sum is over the m observed normal-stress failures, with the Rᵢ terms
accounting for the progressively censored units. Similarly for the accelerated group.
Implementation:
optim(par = c(1, 0.5, 2),
fn = logLik_CSPALT,
method = "L-BFGS-B",
lower = c(0.001, 0.001, 1.001)) # enforces λ > 1Standard Errors are derived from the Fisher information matrix (inverse of the
observed Hessian at the MLE), computed numerically via numDeriv::hessian().
Bayesian inference combines the likelihood with a prior distribution over θ via Bayes' theorem:
π(θ | data) ∝ L(data | θ) · π(θ)
Two competing priors are implemented:
Independent Gamma priors on each parameter:
α ~ Gamma(a₁=2, b₁=1)
β ~ Gamma(a₂=2, b₂=1)
λ ~ Gamma(a₃=2, b₃=1)
This is a conjugate-style, weakly informative prior. The hyperparameters (a=2, b=1) give a prior mean of 2 and allow broad uncertainty.
A joint prior that couples α and β through a Dirichlet-like structure:
log π_GD(α, β, λ) =
lgamma(a₁+a₂) - lgamma(a₁) - lgamma(a₂)
+ (a₀ - a₁ - a₂)·ln(α+β)
+ (a₁-1)·ln(α) + (a₂-1)·ln(β)
- b₀·(α+β)
+ ln Gamma(λ; a₃, b₃)
where a₀=3, b₀=1, a₁=2, a₂=2, a₃=2, b₃=1.
The GD prior captures dependence between α and β, which is realistic since both govern the shape of the same distribution. λ remains independent.
Since the posterior has no closed form, samples are drawn using a component-wise Metropolis-Hastings (MH) within Gibbs sampler:
Algorithm:
For each MCMC iteration i = 2, …, 10000:
1. Propose α* ~ |Normal(αᵢ₋₁, σ_α=0.05)|
Accept with probability min(1, π(α*,β,λ|data) / π(αᵢ₋₁,β,λ|data))
2. Propose β* ~ |Normal(βᵢ₋₁, σ_β=0.02)|
Accept with probability min(1, π(α,β*,λ|data) / π(α,βᵢ₋₁,λ|data))
3. Propose λ* ~ Normal(λᵢ₋₁, σ_λ=0.02) [reflected at λ=1]
Accept with probability min(1, π(α,β,λ*|data) / π(α,β,λᵢ₋₁|data))
MCMC Settings:
Total iterations : 10,000
Burn-in : 2,500 (first 25% discarded)
Thinning : 5 (every 5th sample kept)
Effective draws : 1,500 per parameter
The small proposal widths (0.05 for α, 0.02 for β and λ) are tuned to achieve reasonable acceptance rates (~20–50%), visible in the console output.
Convergence is assessed via Trace Plots (see Section 21 of the R code and the PNG files).
Bayesian point estimates depend on the choice of loss function:
L_SELF(θ̂, θ) = (θ̂ - θ)²
The Bayes estimator under SELF is the posterior mean:
θ̂_SELF = E[θ | data] ≈ (1/N) Σ θ⁽ⁱ⁾
L_LINEX(θ̂, θ) = exp(s·(θ̂ - θ)) - s·(θ̂ - θ) - 1
The Bayes estimator under LINEX is:
θ̂_LINEX = -(1/s) · ln E[exp(-s·θ) | data]
≈ -(1/s) · ln[(1/N) Σ exp(-s·θ⁽ⁱ⁾)]
Two values of s are used:
s |
Behaviour |
|---|---|
s = +2 |
Penalises overestimation more heavily |
s = -2 |
Penalises underestimation more heavily |
LINEX is particularly useful in reliability engineering where asymmetric costs apply (e.g., shipping a product with shorter-than-expected life is worse than being conservative).
Based on MLE + Fisher information:
CI: θ̂ ± z_{α/2} · SE(θ̂) where SE = sqrt(diag(Fisher⁻¹))
α = 0.05 → z = 1.96. This interval is valid asymptotically (large sample).
Equal-tail interval from the MCMC posterior sample:
BCI: [Q_{0.025}(θ⁽ⁱ⁾), Q_{0.975}(θ⁽ⁱ⁾)]
Unlike a frequentist CI, the BCI has the direct probability interpretation: "There is a 95% posterior probability that θ lies in this interval."
For a simulation with NSIM = 100 replications, each estimator θ̂ is evaluated by:
| Metric | Formula | Meaning |
|---|---|---|
| AE (Average Estimate) | (1/N)·Σ θ̂ⱼ | Closeness to true value (bias indicator) |
| MSE (Mean Squared Error) | (1/N)·Σ (θ̂ⱼ - θ)² | Combined bias + variance |
| CP (Coverage Probability) | (1/N)·Σ 𝟙[θ ∈ CIⱼ] | Should be ≈ 0.95 for a good CI |
| AL (Average Length) | (1/N)·Σ (Upperⱼ - Lowerⱼ) | Precision — shorter is better |
A good estimator has small MSE and a good CI has CP near 0.95 with small AL.
The full simulation loops over:
- 2 parameter settings:
- Setting 1: (α, β, λ) = (1.6, 0.5, 2)
- Setting 2: (α, β, λ) = (2.5, 1.2, 3)
- 7 censoring schemes (CS1–CS7)
- 100 simulations per configuration
Data generation uses the probability integral transform on the inverse CDF:
Generate_Progressive(n=40, m=30, R=R, alpha, beta, lambda, accelerated=TRUE/FALSE)The algorithm uses the Balakrishnan-Sandhu method for generating progressively censored order statistics via a product of Beta random variables.
Total configurations: 2 × 7 = 14 per method, 9 methods → results in 9 output tables.
Dataset: Insulating fluid breakdown times (in minutes) from a dielectric breakdown experiment. Data are log-transformed to fit the (0,1) support of the UGR distribution:
Normal stress: x_normal = exp(-c(7.74, 17.05, 20.46, 21.02, 22.66,
43.40, 47.30, 139.07))
Accelerated: x_acc = exp(-c(0.27, 0.40, 0.69, 0.79, 2.75,
3.91, 9.88, 13.95, 15.93, 27.80))
Both MLE and Bayesian (Gamma prior, Gamma-Dirichlet prior) estimates are computed on this data, with BCI intervals reported.
Results summary (Gamma prior):
| Parameter | MLE | SELF | LINEX (s=2) | 95% BCI |
|---|---|---|---|---|
| α | 0.2214 | 0.2181 | 0.2147 | [0.1220, 0.3473] |
| β | 0.0140 | 0.0154 | 0.0154 | [0.0089, 0.0227] |
| λ | 1.001 | 1.177 | 1.154 | [1.0048, 1.6104] |
The acceleration factor λ > 1 is confirmed, though it is close to 1, suggesting a moderate acceleration effect in this dataset.
| File | Contents |
|---|---|
Table1_MLE.csv |
MLE point estimates: AE and MSE for all 14 configurations |
Table2_ACI.csv |
Asymptotic CI: coverage probability (CP) and average length (AL) |
Table3_SELF_Gamma.csv |
Bayes SELF under Gamma prior: AE and MSE |
Table4_LINEX2_Gamma.csv |
Bayes LINEX (s=+2) under Gamma prior |
Table5_LINEXN2_Gamma.csv |
Bayes LINEX (s=−2) under Gamma prior |
Table6_BCI.csv |
Bayesian credible interval: CP and AL |
Table7_SELF_GD.csv |
Bayes SELF under Gamma-Dirichlet prior |
Table8_LINEX2_GD.csv |
Bayes LINEX (s=+2) under GD prior |
Table9_LINEXN2_GD.csv |
Bayes LINEX (s=−2) under GD prior |
RealData_Results.csv |
Real data estimates (Gamma prior) |
RealData_Results_100.csv |
Real data estimates (MCMC with N=100 chains) |
RealData_Results_20.csv |
Real data estimates (MCMC with N=20 chains) |
TracePlot_Alpha.png |
MCMC trace plot for α (real data) |
TracePlot_Beta.png |
MCMC trace plot for β (real data) |
TracePlot_Lambda.png |
MCMC trace plot for λ (real data) |
TracePlot_Alpha_100.png |
Trace for α with 100-iteration burn-in variant |
TracePlot_Beta_100.png |
Trace for β with 100-iteration burn-in variant |
TracePlot_Lambda_100.png |
Trace for λ with 100-iteration burn-in variant |
2201MC30_THESIS.pdf |
Full B.Tech thesis with derivations and analysis |
Estimation Code.R |
Complete R source code |
install.packages(c("stats", "numDeriv", "MASS", "coda", "ggplot2"))source("Estimation Code.R")NSIM <- 100 # number of simulation replications
N_MCMC <- 10000 # total MCMC iterations
BURNIN <- 2500 # iterations to discard
THIN <- 5 # thinning intervalReducing NSIM to 10 and N_MCMC to 2000 gives a fast test run in ~5 minutes.
The full run (NSIM=100, N_MCMC=10000, all 14 settings) takes several hours.
| Section | What it does |
|---|---|
| 1–2 | Load libraries, set seed, global constants |
| 3 | UGR PDF, CDF, survival functions (normal + accelerated) |
| 4 | Inverse CDF for data generation |
| 5–6 | Define 7 censoring schemes and 2 parameter settings |
| 7 | Progressive sample generator |
| 8–9 | Log-likelihood and MLE via optim + Hessian |
| 10–12 | Gamma and GD log-priors, log-posteriors |
| 13 | MH-within-Gibbs MCMC sampler |
| 14–15 | Bayes estimators (SELF, LINEX) and credible intervals |
| 16 | MSE, ABS, CP, AL performance metrics |
| 17–18 | Result data frames + main simulation loop |
| 19 | Export all tables to CSV |
| 20–21 | Real data analysis + trace plots |
From the simulation study (representative findings):
MLE performs well but can be biased for small effective sample sizes (aggressive censoring like CS2). The MSE for λ tends to be larger than for α and β because λ is constrained to (1, ∞) and the boundary at 1.001 is hit in some replications.
Coverage probability for ACI is generally close to 0.95 (range 0.88–0.97 across schemes), with CS6 showing slightly lower coverage for λ.
Bayesian SELF estimates under both priors achieve lower MSE than MLE in most configurations, especially for Setting 1. The GD prior provides marginally tighter estimates than the independent Gamma prior when α and β are both small.
LINEX with s=+2 vs s=−2: When true λ is large (Setting 2, λ=3), underestimation is more common and s=−2 (penalising underestimation) yields lower average loss.
Censoring scheme impact: CS4 and CS2 tend to produce the lowest MSE for α, suggesting that withdrawals concentrated early or at a specific middle point preserve more tail information about the UGR distribution.
Statistical-Inference-main/
│
├── Estimation Code.R ← Main R script (all 21 sections)
│
├── 2201MC30_THESIS.pdf ← Full thesis document
│
├── RealData_Results.csv ← Real data: all estimators
├── RealData_Results_100.csv ← Real data: 100-chain variant
├── RealData_Results_20.csv ← Real data: 20-chain variant
│
├── Table1_MLE.csv ← Simulation: MLE
├── Table1_MLE_Results_100.csv ← MLE with n=100 variant
├── Table1_MLE_Results_20.csv ← MLE with n=20 variant
├── Table2_ACI.csv ← ACI intervals
├── Table2_SELF_Gamma_100.csv
├── Table2_SELF_Gamma_20.csv
├── Table3_SELF_Gamma.csv ← Bayes SELF / Gamma prior
├── Table3_LINEX2_Gamma_100.csv
├── Table3_LINEX2_Gamma_20.csv
├── Table4_LINEX2_Gamma.csv ← Bayes LINEX s=+2 / Gamma
├── Table4_LINEXN2_Gamma_100.csv
├── Table4_LINEXN2_Gamma_20.csv
├── Table5_LINEXN2_Gamma.csv ← Bayes LINEX s=−2 / Gamma
├── Table5_SELF_GD_100.csv
├── Table5_SELF_GD_20.csv
├── Table6_BCI.csv ← Bayesian credible intervals
├── Table7_SELF_GD.csv ← Bayes SELF / GD prior
├── Table8_LINEX2_GD.csv ← Bayes LINEX s=+2 / GD
├── Table9_LINEXN2_GD.csv ← Bayes LINEX s=−2 / GD
│
├── TracePlot_Alpha.png ← MCMC convergence: α
├── TracePlot_Beta.png ← MCMC convergence: β
├── TracePlot_Lambda.png ← MCMC convergence: λ
├── TracePlot_Alpha_100.png
├── TracePlot_Beta_100.png
├── TracePlot_Lambda_100.png
├── TracePlot_Alpha_20.png
├── TracePlot_Beta_20.png
└── TracePlot_Lambda_20.png
| Term | Meaning |
|---|---|
| UGR | Unit Generalized Rayleigh — a distribution on (0,1) |
| CSPALT | Constant Stress Partially Accelerated Life Test |
| MLE | Maximum Likelihood Estimator |
| ACI | Asymptotic Confidence Interval (frequentist) |
| MCMC | Markov Chain Monte Carlo |
| MH | Metropolis-Hastings acceptance-rejection step |
| SELF | Squared Error Loss Function |
| LINEX | Linear-Exponential loss function |
| BCI | Bayesian Credible Interval (equal-tail) |
| GD prior | Gamma-Dirichlet joint prior on (α, β) |
| CP | Coverage Probability |
| AL | Average Length of interval |
| MSE | Mean Squared Error |
| AE | Average Estimate (across simulations) |
| Burn-in | Initial MCMC samples discarded before convergence |
| Thinning | Keeping every k-th MCMC sample to reduce autocorrelation |
B.Tech Thesis — Statistical Inference Project Roll No.: 2201MC30