bHIVE

B-cell Hybrid Immune Variant Engine

Overview

bHIVE is an R package implementing a modular Artificial Immune System (AIS) framework for clustering and classification. Built on AI-Net (de Castro & Von Zuben 2001), bHIVE extends the classical algorithm with biologically-grounded modules drawn from modern immunology: somatic hypermutation, idiotypic network regulation, germinal center selection, and microenvironment-driven adaptation.

Performance-critical operations (affinity/distance matrices, clonal selection, network suppression, mutation) are implemented in C++ via RcppArmadillo, with parallelization support through BiocParallel.

Key Features

  • Two tasks -- clustering and classification on numeric matrices
  • C++ backend -- BLAS-optimized bulk affinity/distance computation
  • Two APIs -- functional (bHIVE()) for quick use and R6 (AINet$new()) for full module composition
  • Multilayer architecture -- honeycombHIVE() for hierarchical prototype refinement across layers
  • Hyperparameter tuning -- swarmbHIVE() with grid search and BiocParallel
  • Gradient refinement -- refineB() post-processing with 5 optimizers and several classification-aware loss functions
  • Composable immune modules -- mix and match biological mechanisms via dependency injection
  • caret compatible -- bHIVEmodel and honeycombHIVEmodel for cross-validation workflows

Installation

# install.packages("devtools")  # if devtools is not already installed
devtools::install_github("BorchLab/bHIVE")

Quick Start

Functional API

The functional API is the simplest way to use bHIVE and works like any R modeling function:

library(bHIVE)
data(iris)
X <- as.matrix(iris[, 1:4])

# Clustering
res <- bHIVE(X, task = "clustering", nAntibodies = 30, maxIter = 20)
table(res$assignments)

# Classification
res <- bHIVE(X, y = iris$Species, task = "classification",
             nAntibodies = 30, maxIter = 20)
table(Predicted = res$assignments, Actual = iris$Species)

R6 API with Modules

For full control, compose an AINet with any combination of immune modules. Each module is optional and slots into the fit loop at the appropriate biological stage:

# Adaptive mutation + idiotypic network regulation
model <- AINet$new(
  nAntibodies = 20,
  maxIter = 30,
  shm = SHMEngine$new(method = "adaptive", base_rate = 0.1),
  idiotypic = IdiotypicNetwork$new(theta_low = 0.01, theta_high = 0.5),
  verbose = FALSE
)
model$fit(X, iris$Species, task = "classification")
table(model$result$assignments)

# Predict on new data
preds <- model$predict(X[1:10, ])

A richer composition uses microenvironment-aware exploration, Tfh-mediated quality selection, density-driven isotype switching, and a persistent memory pool:

me  <- Microenvironment$new()                              # zone classification
cs  <- ClassSwitcher$new(alpha_IgM = 0.1, alpha_IgG = 5)   # zone -> kernel width
gc  <- GerminalCenter$new(nTfh = 8, selectionPressure = 0.6)
mp  <- MemoryPool$new(archive_threshold = 0.05)            # persists across fits

model <- AINet$new(
  nAntibodies = 30, maxIter = 20,
  shm              = SHMEngine$new(method = "hotspot"),
  microenvironment = me,
  classSwitcher    = cs,            # requires microenvironment
  germinalCenter   = gc,
  memory           = mp,
  verbose          = FALSE
)
model$fit(X, iris$Species, task = "classification")

# Memory persists -- a second fit will recall relevant cells (clustering only)
# and continue archiving high-affinity antibodies.
mp$size()

Architecture

Algorithm

bHIVE evolves a population of antibody vectors to represent structure in data. Each iteration runs a subset of the following stages; modules attach to specific stages and are skipped when absent:

  1. Initialization (once) -- sample from data, random generation, kmeans++, or V(D)J combinatorial assembly via VDJLibrary. If a MemoryPool carries cells from a prior fit, relevant memory is recalled and merged into the starting repertoire (clustering only).
  2. Activation gating -- ActivationGate sets aside antibodies in over-dense neighborhoods or below an affinity floor so clonal selection runs on the sparse subset; gated antibodies rejoin after.
  3. Clonal selection + SHM -- top-k antibodies cloned per data point. Mutation dispatches through SHMEngine (uniform, airs, hotspot, energy, adaptive); the adaptive strategy threads per-antibody Adam-style moment matrices across iterations.
  4. Germinal center selection -- GerminalCenter runs Tfh-mediated quality selection; survivors weighted by clustering compactness or classification purity.
  5. Microenvironment & class switching -- Microenvironment classifies each antibody into stable / explore / boundary zones and applies density-dependent jitter. ClassSwitcher binds each zone to an isotype (IgM/IgG/IgA) and sets the next iteration's kernel width.
  6. Idiotypic regulation -- IdiotypicNetwork runs bell-shaped Ab-Ab dynamics that cull both over-clumped and isolated antibodies. Falls back to a top-population safety net if ill-tuned thresholds would kill the repertoire.
  7. Network suppression -- removes near-duplicate antibodies within an epsilon-ball under the chosen distance metric.
  8. Orphan pruning + final assignment (once) -- antibodies that bind no training point are dropped; remaining cells produce cluster IDs or class predictions. MemoryPool archives high-affinity survivors back into the pool.
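
The core of the loop (stages 1, 3, 7, and 8) can be sketched in a few lines of base R. This is an illustration only, not the package's C++ implementation: the helper names (`gaussian_affinity`, `suppress`), the top-1 clonal selection, the affinity-scaled step toward each data point (in the style of the classical aiNet update), and all parameter values are assumptions chosen for readability. The optional modules (stages 2, 4, 5, and 6) are left out.

```r
# Illustrative aiNet-style loop in base R; NOT the bHIVE implementation.
set.seed(1)
X <- as.matrix(iris[, 1:4])

gaussian_affinity <- function(X, A, alpha = 1) {
  # Squared Euclidean distance between every row of X and every row of A
  d2 <- outer(rowSums(X^2), rowSums(A^2), "+") - 2 * X %*% t(A)
  exp(-alpha * pmax(d2, 0))
}

suppress <- function(A, epsilon) {
  # Greedy network suppression: keep an antibody only if it lies farther
  # than epsilon from every antibody already kept.
  D <- as.matrix(dist(A))
  keep <- 1L
  for (j in seq_len(nrow(A))[-1]) {
    if (all(D[j, keep] > epsilon)) keep <- c(keep, j)
  }
  A[keep, , drop = FALSE]
}

# Stage 1: initialize antibodies by sampling rows of the data
A <- X[sample(nrow(X), 10), , drop = FALSE]

for (iter in 1:5) {
  aff <- gaussian_affinity(X, A)
  # Stage 3: clone the best-matching antibody for each data point (top-k, k = 1)
  best <- max.col(aff, ties.method = "first")
  parents <- A[best, , drop = FALSE]
  # Mutation: move each clone toward its data point; the step shrinks as affinity grows
  rate <- 0.5 * (1 - aff[cbind(seq_len(nrow(X)), best)])
  clones <- parents + rate * (X - parents)
  # Stage 7: drop near-duplicates from the combined population
  A <- suppress(rbind(A, clones), epsilon = 0.5)
}

# Stage 8: prune antibodies that bind no data point, then assign
bound <- max.col(gaussian_affinity(X, A), ties.method = "first")
A <- A[sort(unique(bound)), , drop = FALSE]
assignments <- max.col(gaussian_affinity(X, A), ties.method = "first")
table(assignments)
```

After the final pruning step, every surviving antibody is the best match for at least one data point, and all surviving antibodies are more than `epsilon` apart.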

Immune Modules

Each module is an R6 class that can be injected into AINet via its constructor. All modules are optional -- use only what you need.

Module              Biological Basis              What It Does
SHMEngine           Somatic hypermutation         5 mutation strategies: uniform, airs, hotspot, energy, adaptive
IdiotypicNetwork    Ab-Ab network regulation      Bell-shaped activation dynamics replacing epsilon-threshold suppression
GerminalCenter      Tfh-B cell interaction        Task-aware quality selection with resource competition
Microenvironment    Tissue microenvironment cues  Density-dependent zone classification and mutation rate modulation
VDJLibrary          V(D)J recombination           Combinatorial gene library initialization (PCA, cluster, random partition)
ActivationGate      Two-signal activation         Costimulatory filtering (density, danger signal, or label entropy)
MemoryPool          Immunological memory          Archive high-affinity antibodies and recall on distribution shift
ClassSwitcher       Isotype class switching       IgM (broad) / IgG (specific) / IgA (boundary) kernel width modulation
ConvergentSelector  Public clonotypes             Cross-repertoire consensus for ensemble methods

Multilayer & Tuning

# honeycombHIVE: hierarchical prototype refinement
res <- honeycombHIVE(X, y = iris$Species, task = "classification",
                     layers = 3, nAntibodies = 30,
                     refine = TRUE, refineOptimizer = "adam")

# swarmbHIVE: hyperparameter grid search (parallelizable)
grid <- expand.grid(nAntibodies = c(15, 30), beta = c(3, 5), epsilon = c(0.01, 0.1))
best <- swarmbHIVE(X, y = iris$Species, task = "classification",
                   grid = grid, metric = "accuracy", maxIter = 20)
best$best_params
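
To make the search itself concrete, here is a minimal base-R sketch of an exhaustive grid search over the same grid. The scoring function `score_fn` is hypothetical: it stands in for "fit a model with these parameters and return the metric" and is not part of bHIVE.

```r
# Same grid as above: 2 x 2 x 2 = 8 parameter combinations
grid <- expand.grid(nAntibodies = c(15, 30), beta = c(3, 5), epsilon = c(0.01, 0.1))
nrow(grid)

# Hypothetical stand-in for "fit with these parameters, return a score".
# A real search would train a model here and compute, e.g., accuracy.
score_fn <- function(params) {
  -abs(params$nAntibodies - 30) - abs(params$beta - 5) - params$epsilon
}

# Evaluate every row of the grid and keep the best-scoring one
scores <- vapply(seq_len(nrow(grid)), function(i) score_fn(grid[i, ]), numeric(1))
best_params <- grid[which.max(scores), ]
best_params
```

Because each grid row is evaluated independently, the `vapply()` call is the part that a parallel backend would distribute across workers.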

Gradient Refinement

Fine-tune antibody positions after training with refineB():

res <- bHIVE(X, y = iris$Species, task = "classification",
             nAntibodies = 20, maxIter = 20)

# Adam-based refinement with cross-entropy loss
A_refined <- refineB(res$antibodies, X, y = iris$Species,
                     assignments = res$assignments,
                     task = "classification",
                     loss = "categorical_crossentropy",
                     optimizer = "adam", steps = 10, lr = 0.01)
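
For intuition about what gradient refinement does, the sketch below applies the standard Adam update to prototype positions in base R. It is a simplified illustration under stated assumptions, not the internals of refineB(): the loss is the mean squared distance between each point and its assigned prototype (not cross-entropy), the assignments are held fixed, and the helper names `nearest`, `loss`, and `grad` are made up for this example.

```r
set.seed(42)
X <- as.matrix(iris[, 1:4])

# Stand-in for trained antibodies: three rows sampled from the data
A <- X[sample(nrow(X), 3), , drop = FALSE]

nearest <- function(X, A) {
  # Index of the closest prototype (squared Euclidean distance) for each row of X
  d2 <- outer(rowSums(X^2), rowSums(A^2), "+") - 2 * X %*% t(A)
  max.col(-d2, ties.method = "first")
}
assignments <- nearest(X, A)

# Mean squared distance between each point and its assigned prototype
loss <- function(A) mean(rowSums((X - A[assignments, , drop = FALSE])^2))

# Gradient of the loss with respect to each prototype row
grad <- function(A) {
  G <- matrix(0, nrow(A), ncol(A))
  for (k in seq_len(nrow(A))) {
    idx <- which(assignments == k)
    if (length(idx) > 0) {
      G[k, ] <- 2 * (length(idx) * A[k, ] - colSums(X[idx, , drop = FALSE])) / nrow(X)
    }
  }
  G
}

# Standard Adam update with bias-corrected moment estimates
lr <- 0.01; beta1 <- 0.9; beta2 <- 0.999; eps <- 1e-8
m <- v <- matrix(0, nrow(A), ncol(A))
loss_before <- loss(A)
for (step in 1:50) {
  g <- grad(A)
  m <- beta1 * m + (1 - beta1) * g
  v <- beta2 * v + (1 - beta2) * g^2
  A <- A - lr * (m / (1 - beta1^step)) / (sqrt(v / (1 - beta2^step)) + eps)
}
loss_after <- loss(A)
c(before = loss_before, after = loss_after)
```

Each prototype drifts toward the mean of the points assigned to it, so the loss after refinement is lower than before.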

Affinity & Distance Functions

Affinity    Formula                       Use Case
gaussian    exp(-alpha ||x - a||^2)       General purpose (default)
laplace     exp(-alpha ||x - a||)         Heavier tails than Gaussian
polynomial  (x . a + c)^p                 Non-Euclidean similarity
cosine      (x . a) / (||x|| ||a||)       Direction-based similarity
hamming     1 - (mismatches / d)          Categorical/binary features

Distance     Notes
euclidean    Default; L2 norm
manhattan    L1 norm
minkowski    Generalized Lp (parameter p)
cosine       1 - cosine similarity
mahalanobis  Accounts for feature covariance (requires Sigma)
hamming      Count of differing features
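
The formulas in both tables translate directly into base R for a single pair of vectors. The sketch below is for reference only and does not call the package's C++ routines. The list names `affinity` and `distance`, the argument name `coef0` (the constant c), and all default parameter values are assumptions, as is reading the norm in the laplace kernel as the L2 norm.

```r
# Pairwise affinity functions, written from the table formulas
affinity <- list(
  gaussian   = function(x, a, alpha = 1) exp(-alpha * sum((x - a)^2)),
  laplace    = function(x, a, alpha = 1) exp(-alpha * sqrt(sum((x - a)^2))),
  polynomial = function(x, a, coef0 = 1, p = 2) (sum(x * a) + coef0)^p,
  cosine     = function(x, a) sum(x * a) / (sqrt(sum(x^2)) * sqrt(sum(a^2))),
  hamming    = function(x, a) 1 - sum(x != a) / length(x)
)

# Pairwise distance functions
distance <- list(
  euclidean   = function(x, a) sqrt(sum((x - a)^2)),
  manhattan   = function(x, a) sum(abs(x - a)),
  minkowski   = function(x, a, p = 3) sum(abs(x - a)^p)^(1 / p),
  cosine      = function(x, a) 1 - affinity$cosine(x, a),
  mahalanobis = function(x, a, Sigma) sqrt(drop(t(x - a) %*% solve(Sigma, x - a))),
  hamming     = function(x, a) sum(x != a)
)

x <- c(1, 0, 2)
a <- c(1, 2, 0)

distance$euclidean(x, a)             # sqrt(8), about 2.83
distance$manhattan(x, a)             # 4
distance$hamming(x, a)               # 2 differing features
affinity$hamming(x, a)               # 1 - 2/3
affinity$cosine(x, a)                # 1 / 5 = 0.2
distance$mahalanobis(x, a, diag(3))  # identity covariance reduces to Euclidean
```

With an identity covariance matrix the Mahalanobis distance equals the Euclidean distance, and the Minkowski distance with p = 2 does the same, which makes both easy to sanity-check.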

Bug Reports / Feature Requests

If you run into a bug or other issue, please open a GitHub issue describing it and, if possible, include a reproducible example. Requests for new features or enhancements can also be submitted as GitHub issues.

Contributing

We welcome contributions to the bHIVE project! To contribute:

  • Fork the repository.
  • Create a feature branch (git checkout -b feature-branch).
  • Commit your changes (git commit -m "Add new feature").
  • Push to the branch (git push origin feature-branch).
  • Open a pull request.

License

MIT (see LICENSE.md). The repository also contains a LICENSE file with no recognized license type.