SMGP — Spectral Memory Graph Processor

Persistent, hallucination-free AI reasoning through spectral graph theory, hyperdimensional computing, and dedicated hardware acceleration.

SMGP is a novel AI system that combines spectral graph theory, topological data analysis (TDA), hyperdimensional computing (HDC), and category-theoretic graph rewriting to achieve persistent memory, subquadratic context processing, and verifiable reasoning. The framework ships with a complete Python software stack and an open-source hardware accelerator design (SMGPU) in SystemVerilog, enabling transparent offloading of compute-intensive kernels to an FPGA.

Why SMGP?

Modern LLMs suffer from three fundamental limitations:

No persistent memory — every conversation starts from scratch; facts learned in one session are forgotten in the next.
O(N^2) attention — processing long contexts is prohibitively expensive, limiting effective context windows.
Hallucination — models generate plausible-sounding but factually incorrect claims without any mechanism for verification.

SMGP addresses all three by encoding knowledge in a spectral memory graph — a persistent, queryable knowledge structure where:

Every fact is a node addressed by a hyperdimensional vector (bipolar, 10,000-dimensional).
Relationships are edges typed with HD-encoded relations.
Attention is computed via graph Fourier analysis in O(N log N) time.
Claims are verified against graph paths before being surfaced to the user.

On top of the software framework, SMGP provides SMGPU — a dedicated hardware accelerator in SystemVerilog that offloads spectral transforms, HD operations, topological persistence, and graph rewriting to an FPGA. The Python stack can seamlessly route operations to hardware via a transparent HAL, achieving 10-100x speedup on supported kernels without changing application code.

Features

Core (Software)

Feature	Description
SpectralMemoryGraph	Typed property graph with hyperdimensional node addressing
HyperdimensionalMemory	Bipolar HD vectors with bind/unbind/query operations
SpectralMethods	Graph Laplacian, eigendecomposition, Fourier analysis
TopologicalAnalyzer	Persistent homology for controlled memory pruning
GraphRewriter	Category-theoretic DPO graph rewriting

Memory & Attention

Feature	Description
MemoryStore	Thread-safe persistent storage with LRU eviction
AssociativeMemory	Content-addressable retrieval by HD similarity
MemoryLifecycle	Persistence-based forgetting and consolidation
SpectralAttention	O(N log N) multiscale attention via graph Fourier

Reasoning

Feature	Description
ClaimVerifier	Path-based claim verification against knowledge graph
NeuroSymbolicPlanner	Chain-of-thought planning with graph-grounded facts

Integration

Feature	Description
HuggingFace	`SMGPForCausalLM` — drop-in replacement for `AutoModel`
LangChain	`SMGPMemory` and `SMGPVerifierTool` for agent pipelines
REST API	FastAPI server with graph, query, verify, and reason endpoints

Hardware Accelerator (SMGPU)

Feature	Description
Spectral Engine	16x16 systolic array for Laplacian, eigen-decomp, Chebyshev convolution, GFT
HD Engine	16 parallel banks for bind/unbind (XOR), bundle (majority), similarity (popcount)
Topology Engine	Union-Find persistent homology, barcode streaming, Wasserstein stability
Rewrite Engine	DPO subgraph matching with backtracking and proof trace
Memory Subsystem	Associative CAM cache (256 entries), HBM2 controller (4-ch, 256-bit), memristor crossbar (1024x1024 analog MVM)
Interconnect	2D mesh NoC (5-port, XY routing, 2 VCs), graph-aware scatter-gather DMA
ISA	32-bit instruction set with 8 opcode classes, flag-based chaining and streaming
Python HAL	Transparent HWExecutor backend, MemoryMapper for host-to-device transfers

Repository Structure

smgp/
  README.md                          # This file
  PACKAGE.md                         # Package-specific README (used on PyPI)
  LICENSE                            # Apache 2.0
  CITATION.cff                       # Citation metadata
  CHANGELOG.md                       # Version changelog
  CODE_OF_CONDUCT.md                 # Community code of conduct
  MANIFEST.in                        # sdist file inclusion rules
  pyproject.toml                     # Python package & tool configuration
  setup.cfg                          # Setuptools config
  pytest.ini                         # Test runner config

  src/smgp/                          # Python source
    __init__.py                      # Package init
    config.py                        # YAML/JSON/env configuration
    cli.py                           # Command-line interface tools
    hw_bridge.py                     # Hardware executor bridge (HAL ↔ core)
    core/                            # Core data structures
      graph.py                       #   SpectralMemoryGraph
      spectral.py                    #   SpectralMethods (Laplacian, GFT, Chebyshev)
      hyperdim.py                    #   HyperdimensionalMemory
      topology.py                    #   TopologicalAnalyzer (persistent homology)
      category.py                    #   GraphRewriter (DPO)
      distributed.py                 #   Distributed memory graph support
    memory/                          # Memory subsystem
      store.py                       #   MemoryStore (key-value persistence)
      associative.py                 #   O(1) HD associative recall
      lifecycle.py                   #   Persistence-based forgetting
      enhanced_lifecycle.py          #   Enhanced lifecycle with graph pruning
      prune_policies.py              #   Graph pruning & compression policies
      event_log.py                   #   Event-sourced history log
    attention/                       # Attention mechanisms
      spectral_attn.py               #   O(N log N) multiscale spectral attention
      streaming_attention.py         #   Streaming spectral attention
    reasoning/                       # Neuro-symbolic reasoning
      verifier.py                    #   ClaimVerifier (path-based)
      planner.py                     #   NeuroSymbolicPlanner (DPO rewrite search)
      explainable.py                 #   Explainability & audit trails
    integration/                     # External integrations
      huggingface.py                 #   SMGPForCausalLM
      langchain.py                   #   SMGPMemory, SMGPVerifierTool
      api.py                         #   FastAPI REST server
      api_auth.py                    #   REST API auth & rate limiting
      onnx_vllm.py                   #   vLLM / ONNX backend
      federation.py                  #   External knowledge base federator
      vectordb.py                    #   Vector database bridge
    utils/                           # Utilities
      io.py                          #   Graph save/load
      bench.py                       #   Benchmarking CLI
      multimodal.py                  #   Multi-modal graph embeddings
      tuner.py                       #   Auto-tuning
    enhanced/
      speed/                         #   C/C++ acceleration backend (Cython)

  hardware/                          # SMGPU hardware accelerator
    README_HW.md                     # Hardware-specific README
    rtl/                             # SystemVerilog 2017 RTL
      isa/smgp_isa_pkg.sv            #   ISA encoding, opcodes, flags
      lib/                           #   Shared arithmetic packages
      compute/                       #   Spectral, HD, topology, graph-rewrite engines
      memory/                        #   CAM, HBM controller, memristor crossbar
      interconnect/                  #   Scatter-gather DMA, 2D mesh NoC router
      top/                           #   smgp_core.sv / smgp_system.sv
    sim/                             # Simulation & verification
      scripts/run_verilator.sh       #   Lint & simulation driver (6/6 TB pass)
      models/                        #   Python golden models
      testbench/                     #   6 SystemVerilog testbenches
    sw/smgp_hal/                     # Python HAL
      hw_session.py                  #   Hardware session management
      executor.py                    #   HWExecutor drop-in backend
      memory_mapper.py               #   Host-to-device memory mapping
      cycle_simulator.py             #   Cycle-accurate simulator
    fpga/build/                      # FPGA build (Vivado)
      Makefile                       #   synth / impl / xclbin / bit_script targets
      vivado_project.tcl             #   Vivado project generator
      scripts/                       #   write_xclbin.tcl, write_bitstream.tcl
    fpga/constraints/                # Pin & timing constraints (U280, ZedBoard, Artix)
    driver/                          # C PCIe driver + smgp_hw_lib
    asic/                            # ASIC synthesis (OpenLane)
    docs_hw/                         # Architecture, ISA reference, integration guide

  tests/                             # Python test suite (112 tests)
    test_graph.py
    test_spectral.py
    test_hyperdim.py
    test_topology.py
    test_category.py
    test_memory_store.py
    test_spectral_attn.py
    test_verifier.py
    test_planner.py
    test_end_to_end.py
    test_huggingface_integration.py
    test_langchain_integration.py
    test_hardware_integration_wired.py  # Hardware executor wiring (mock-based)

  tests_enhanced/                    # Enhanced test suite (123 tests)
    test_hd_hypothesis.py            #   Property-based tests (Hypothesis)
    test_cli.py
    test_distributed.py
    test_event_sourcing.py
    test_multimodal.py
    test_onnx_vllm.py
    test_streaming_attention.py
    test_vectordb.py
    test_federation.py
    test_prune_policies.py
    test_explainability.py

  benchmarks/                        # ASV performance benchmarks
  enhancements/                      # Enhancement scripts & CI workflows
  notebooks/                         # Jupyter notebooks & tutorials
  examples/                          # Runnable usage examples
  docs/                              # Documentation (quickstart, API reference)

  .github/workflows/                 # CI/CD
    tests.yml                        #   Python tests (Ubuntu + macOS, 3.10–3.12)

  ROADMAP.md                         # Project roadmap (not in sdist)
  RESEARCH.md                        # Mathematical foundations (not in sdist)

Enhancements

The following 25 enhancements extend SMGP across software performance, hardware support, infrastructure, and community. Test results: 230 passed, 5 skipped, 0 failures (112 in tests/ + 123 in tests_enhanced/).

Software (12)

#	Enhancement	Description	Key Files
1	C/C++ Acceleration Backend	Cython-based speed extensions with pure-Python fallback wrapper	`src/smgp/enhanced/speed/*.pyx`, `setup_speed.py`
2	Graph Pruning & Compression	Persistence-based pruning policies and enhanced lifecycle management	`src/smgp/memory/prune_policies.py`, `enhanced_lifecycle.py`
3	Streaming Spectral Attention	Online attention for token streams without full recomputation	`src/smgp/attention/streaming_attention.py`
4	External Knowledge Base Federator	Federated querying across multiple external knowledge bases	`src/smgp/integration/federation.py`
5	Multi-Modal Graph Embeddings	Graph node embeddings from text, image, and audio inputs	`src/smgp/utils/multimodal.py`
6	Auto-Tuning	Automatic hyperparameter optimisation for graph operations	`src/smgp/utils/tuner.py`
7	Explainability & Audit Trails	Human-readable reasoning traces and decision audit logging	`src/smgp/reasoning/explainable.py`
8	vLLM / ONNX Backend	Inference acceleration via vLLM serving and ONNX Runtime	`src/smgp/integration/onnx_vllm.py`
9	Vector Database Bridge	Connect SMGP graphs to external vector stores (FAISS, Chroma, etc.)	`src/smgp/integration/vectordb.py`
10	REST API Auth & Rate Limiting	Token-based authentication and per-client rate limiting	`src/smgp/integration/api_auth.py`
11	Distributed Memory Graph	Partitioned graph storage across multiple nodes for scale-out	`src/smgp/core/distributed.py`
12	Event-Sourced History Log	Immutable append-only log of all graph mutations	`src/smgp/memory/event_log.py`

Infrastructure (5)

#	Enhancement	Description	Key Files
13	Property-Based Testing	Hypothesis-driven fuzz tests for HD vectors and graph operations	`tests_enhanced/test_hd_hypothesis.py`
14	ASV Benchmarks	Airspeed Velocity regression benchmarks for CI tracking	`benchmarks/`
15	Mutation Testing	CI workflow that runs mutation testing to gauge how well the test suite detects injected faults	`enhancements/workflows/mutation_testing.yml`
16	CLI Tools	Command-line interface for graph management and benchmarking	`src/smgp/cli.py`
17	Jupyter Notebooks	Three interactive notebooks for tutorials and demonstrations	`notebooks/*.ipynb`

Hardware (5)

#	Enhancement	Description	Key Files
18	Pre-Built Wheels	CI workflow to publish pre-built wheels for major platforms	`enhancements/workflows/wheels.yml`
19	Cycle-Accurate Simulator	Instruction-level cycle model for pre-silicon performance analysis	`hardware/sim/cycle_model/`, `hardware/sw/smgp_hal/cycle_simulator.py`
20	Additional FPGA Boards	Constraint files and build scripts for Arty A7, Alveo U250, VCK190	`hardware/fpga/constraints/arty_a7.xdc`, `alveo_u250.xdc`, `vck190.xdc`
21	ASIC Synthesis (OpenLane)	Open-source ASIC flow targeting SkyWater 130nm / GF 12nm	`hardware/asic/`
22	PYNQ Backend	Deploy SMGPU overlays on Xilinx Zynq via PYNQ framework	`hardware/sw/pynq/`

Community (3)

#	Enhancement	Description	Key Files
23	Community Files	ROADMAP, CHANGELOG, CODE_OF_CONDUCT, issue/PR templates	`ROADMAP.md`, `CHANGELOG.md`, `CODE_OF_CONDUCT.md`, `.github/`
24	Research Paper	Accompanying academic paper describing SMGP's architecture	`PAPER.md`
25	PyPI Patch	Patch for upstream PyPI packaging and deployment fixes	`pyproject.patch`, `enhancements/apply_patch.sh`

Installation

From PyPI

pip install smgp

With optional dependencies

# Topological data analysis (persistent homology)
pip install "smgp[topology]"

# REST API server + HTTP client
pip install "smgp[integration]"

# JWT & rate-limiting auth layer
pip install "smgp[auth]"

# HuggingFace Transformers integration (requires Python ≤3.12, numpy<2.0)
pip install "smgp[huggingface]"

# LangChain integration
pip install "smgp[langchain]"

# Vector DB bridge (Qdrant, Pinecone, Weaviate) + federation (SQLAlchemy)
pip install "smgp[vectordb]"

# ONNX export (requires Python ≤3.12, numpy<2.0)
pip install "smgp[onnx]"

# Everything
pip install "smgp[all]"

Note — PyTorch compatibility: smgp[huggingface] and smgp[onnx] require Python ≤ 3.12 and numpy<2.0 because PyTorch 2.x does not yet publish wheels for Python 3.13+. All other extras work on Python 3.10–3.14.

From source

git clone https://github.com/rotsl/smgp.git
cd smgp

# Standard dev environment (Python 3.10+)
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,topology,integration,auth,vectordb]"

# PyTorch-enabled environment (Python 3.12 recommended)
python3.12 -m venv .venv-torch && source .venv-torch/bin/activate
pip install -e ".[dev,topology,integration,auth,vectordb,huggingface,onnx]"

Hardware simulation prerequisites

# Verilator 5.x (RTL lint & simulation)
apt install verilator  # or build from source

# Cocotb (Python-based testbenches)
pip install cocotb

# Vivado 2023.2+ (FPGA synthesis, Xilinx license required)

Quick Start

Software

from smgp.core.graph import SpectralMemoryGraph

# Create a knowledge graph with 10,000-dimensional HD addressing
graph = SpectralMemoryGraph(hd_dim=10000, seed=42)

# Add nodes with labels and properties
graph.add_node("Socrates", label="person", properties={"era": "ancient Greece"})
graph.add_node("Plato", label="person", properties={"era": "ancient Greece"})
graph.add_node("philosophy", label="field", properties={"domain": "humanities"})

# Add typed edges
graph.add_edge("Socrates", "Plato", "taught")
graph.add_edge("Socrates", "philosophy", "studied")
graph.add_edge("Plato", "philosophy", "studied")

# Query by hyperdimensional similarity
query = graph.get_node("Socrates")["vector"]
similar = graph.query_similar(query, k=3)
for node_id, similarity in similar:
    print(f"  {node_id}: similarity={similarity:.4f}")

# Verify claims against the knowledge graph
from smgp.reasoning.verifier import ClaimVerifier
verifier = ClaimVerifier(graph)
result = verifier.verify("Socrates taught Plato")
print(f"Verified: {result['verified']}, Reasoning: {result['reasoning']}")

Hardware-accelerated

from smgp.core.graph import SpectralMemoryGraph
from hardware.sw.smgp_hal.executor import HWExecutor

# Connect to FPGA and create a hardware-backed graph
executor = HWExecutor(device="/dev/smgpu0", fallback=True)
graph = SpectralMemoryGraph(hd_dim=10000, seed=42, executor=executor)

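# Supported operations now offload to the FPGA transparently
# (with fallback=True, the software path is used when no device is available)
graph.add_node("Socrates", label="person")
graph.add_node("Plato", label="person")
graph.add_edge("Socrates", "Plato", "taught")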

from smgp.core.spectral import SpectralMethods
spectral = SpectralMethods(graph, num_eigenvalues=8)
L = spectral.compute_laplacian(normalized=True)  # runs on FPGA
eigenvalues, eigenvectors = spectral.compute_eigen()  # runs on FPGA
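
The fallback behaviour can be pictured with a small dispatcher. This is a minimal sketch, not the actual `HWExecutor`: the class name `FallbackExecutor`, the `last_backend` attribute, and the device-node check are all assumptions made for illustration, and the hardware path is a placeholder that always fails.

```python
import os

import numpy as np


class FallbackExecutor:
    """Hypothetical dispatcher: try a device kernel first, else use NumPy."""

    def __init__(self, device="/dev/smgpu0", fallback=True):
        self.device = device
        self.fallback = fallback
        # Assumption: the accelerator counts as present if its device node exists.
        self.hw_available = os.path.exists(device)
        self.last_backend = None

    def _hw_laplacian(self, adj):
        # Placeholder for a real driver call; this sketch has no hardware path.
        raise RuntimeError("hardware kernel not implemented in this sketch")

    def laplacian(self, adj):
        if self.hw_available:
            try:
                result = self._hw_laplacian(adj)
                self.last_backend = "hardware"
                return result
            except RuntimeError:
                if not self.fallback:
                    raise
        elif not self.fallback:
            raise RuntimeError(f"device {self.device} not found and fallback is disabled")
        # Software path: combinatorial Laplacian L = D - A.
        self.last_backend = "software"
        return np.diag(adj.sum(axis=1)) - adj


adjacency = np.array([[0.0, 1.0], [1.0, 0.0]])
executor = FallbackExecutor(device="/dev/smgpu0", fallback=True)
laplacian = executor.laplacian(adjacency)
print(executor.last_backend, laplacian.tolist())
```

The point of the pattern is that calling code never branches on the backend; only the executor knows where a kernel actually ran.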

Usage Examples

Knowledge Graph Construction

from smgp.core.graph import SpectralMemoryGraph

graph = SpectralMemoryGraph(hd_dim=10000, seed=42)

# Add nodes — each gets a unique hyperdimensional address vector
graph.add_node("SMGP", label="system", properties={"version": "0.1.0"})
graph.add_node("graph_theory", label="field")
graph.add_node("spectral_analysis", label="technique")
graph.add_node("persistent_memory", label="feature")

# Add typed edges — relation types are HD-encoded for fast matching
graph.add_edge("SMGP", "graph_theory", "based_on")
graph.add_edge("SMGP", "spectral_analysis", "uses")
graph.add_edge("SMGP", "persistent_memory", "provides")

print(f"Graph: {graph.num_nodes} nodes, {graph.num_edges} edges")

for node_id in graph.nodes:
    node = graph.get_node(node_id)
    print(f"  {node_id}: label={node['label']}, props={node['properties']}")

Spectral Analysis

from smgp.core.graph import SpectralMemoryGraph
from smgp.core.spectral import SpectralMethods

graph = SpectralMemoryGraph(hd_dim=1000, seed=42)

# Build a small graph
for i in range(10):
    graph.add_node(f"node_{i}", label="entity")
for i in range(9):
    graph.add_edge(f"node_{i}", f"node_{i+1}", "connects")
graph.add_edge("node_0", "node_5", "connects")  # long-range connection

# Compute spectral properties
spectral = SpectralMethods(graph, num_eigenvalues=5)

L = spectral.compute_laplacian(normalized=True)
print(f"Laplacian shape: {L.shape}")

eigenvalues, eigenvectors = spectral.compute_eigen()
print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors shape: {eigenvectors.shape}")

# Graph Fourier transform of a signal
signal = [1.0 if i % 2 == 0 else 0.0 for i in range(graph.num_nodes)]
fourier_coeffs = spectral.graph_fourier_transform(signal)
print(f"Fourier coefficients: {fourier_coeffs}")

# Spectral clustering
labels = spectral.spectral_clustering(n_clusters=3)
print(f"Cluster labels: {labels}")
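
To see what these calls compute, the sketch below reproduces the normalized Laplacian, its eigendecomposition, and the graph Fourier transform in plain NumPy on the same 10-node topology. It assumes edges are treated as undirected and unweighted, which may differ from how `SpectralMethods` handles typed, directed edges.

```python
import numpy as np

# Same topology as the example above: a 10-node path plus one long-range edge.
n = 10
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
adj[0, 5] = adj[5, 0] = 1.0

# Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
deg = adj.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(deg)
lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

# eigh returns ascending eigenvalues and orthonormal eigenvectors (columns).
eigenvalues, eigenvectors = np.linalg.eigh(lap)

# Graph Fourier transform: project the signal onto the eigenvector basis.
signal = np.array([1.0 if i % 2 == 0 else 0.0 for i in range(n)])
coeffs = eigenvectors.T @ signal

# The inverse transform recovers the signal because the basis is orthonormal.
reconstructed = eigenvectors @ coeffs

print("smallest eigenvalue:", eigenvalues[0])
print("max reconstruction error:", np.max(np.abs(reconstructed - signal)))
```

For a connected graph the smallest eigenvalue is zero (up to rounding), and all eigenvalues of the normalized Laplacian lie in [0, 2].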

Hyperdimensional Memory

from smgp.core.hyperdim import HyperdimensionalMemory

# Create HD memory with 10,000-dimensional bipolar vectors
hd = HyperdimensionalMemory(dim=10000, seed=42)

# Generate random HD vectors
vectors = hd.generate(5)
print(f"Generated {len(vectors)} vectors of dimension {hd.dim}")

# Bind (associative) — analogous to key-value pairing
v1 = hd.generate(1)[0]
v2 = hd.generate(1)[0]
bound = hd.bind(v1, v2)

# Unbind (retrieval) — recover v1 from the bound pair using v2
recovered = hd.unbind(bound, v2)
similarity = hd.similarity(v1, recovered)
print(f"Bind/Unbind recovery similarity: {similarity:.6f}")  # Should be ~1.0

# Bundle (superposition) — combine multiple vectors
v3 = hd.generate(1)[0]
v4 = hd.generate(1)[0]
bundled = hd.bundle([v3, v4])

# Similarity search
candidates = hd.generate(100)
query = candidates[42].copy()
similar = hd.similarity_search(query, candidates, k=5)
print(f"Top-5 matches for query index 42: {[s[0] for s in similar]}")

# HD encode a string
vec = hd.encode_string("Socrates taught Plato")
print(f"Encoded 'Socrates taught Plato' into vector of dim {len(vec)}")
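
The algebra behind bind, unbind, and bundle is short enough to write out in NumPy. This sketch assumes bind is element-wise multiplication and bundle is a majority vote, the usual choices for bipolar vectors; `HyperdimensionalMemory` may differ in details such as tie-breaking, and the helper names here are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(42)
dim = 10_000


def random_bipolar(count):
    # Each component is -1 or +1 with equal probability.
    return rng.choice(np.array([-1, 1], dtype=np.int8), size=(count, dim))


def bind(a, b):
    # Element-wise product; self-inverse, so unbinding is binding again.
    return a * b


def bundle(vectors):
    # Majority vote per component; ties (even counts only) resolve to +1.
    total = np.sum(vectors, axis=0, dtype=np.int32)
    return np.where(total >= 0, 1, -1).astype(np.int8)


def similarity(a, b):
    # Normalized dot product in [-1, 1].
    return float(np.dot(a.astype(np.int32), b.astype(np.int32))) / dim


key, value, other = random_bipolar(3)
bound = bind(key, value)
recovered = bind(bound, value)
bundled = bundle(np.stack([key, value, other]))

print("recovery:", similarity(key, recovered))
print("unrelated:", similarity(key, value))
print("bundle vs member:", similarity(bundled, key))
```

Random vectors in 10,000 dimensions are nearly orthogonal, which is why a bundle stays measurably similar to each of its members while unrelated vectors score near zero.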

Claim Verification

from smgp.core.graph import SpectralMemoryGraph
from smgp.reasoning.verifier import ClaimVerifier

graph = SpectralMemoryGraph(hd_dim=1000, seed=42)

# Build a knowledge base
graph.add_node("Socrates", label="person")
graph.add_node("Plato", label="person")
graph.add_node("Aristotle", label="person")
graph.add_node("philosophy", label="field")
graph.add_node("Academy", label="institution")

graph.add_edge("Socrates", "philosophy", "studied")
graph.add_edge("Socrates", "Plato", "taught")
graph.add_edge("Plato", "philosophy", "studied")
graph.add_edge("Plato", "Aristotle", "taught")
graph.add_edge("Plato", "Academy", "founded")

# Create a verifier
verifier = ClaimVerifier(graph)

# Verify individual claims
claims = [
    "Socrates taught Plato",
    "Plato founded Academy",
    "Socrates founded Academy",     # False — Plato did
    "Aristotle taught Socrates",    # False — reverse direction
]

for claim in claims:
    result = verifier.verify(claim)
    status = "VERIFIED" if result["verified"] else "UNVERIFIED"
    print(f"  [{status}] {claim}")
    print(f"    Reasoning: {result['reasoning']}")
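
The idea of checking a claim against graph paths can be sketched without SMGP at all. The code below is an illustration under stated assumptions, not `ClaimVerifier` itself: `TripleStore` and `verify` are hypothetical names, claims are assumed to have the form "subject relation object", and a claim counts as verified only when a matching directed edge exists.

```python
from collections import defaultdict, deque


class TripleStore:
    """Hypothetical stand-in for the knowledge graph: directed, typed edges."""

    def __init__(self):
        self.out_edges = defaultdict(list)

    def add_edge(self, source, target, relation):
        self.out_edges[source].append((relation, target))

    def has_edge(self, source, relation, target):
        return (relation, target) in self.out_edges.get(source, [])

    def find_path(self, source, target, max_depth=3):
        """Breadth-first search for a directed path of at most max_depth edges."""
        queue = deque([(source, [])])
        visited = {source}
        while queue:
            node, path = queue.popleft()
            if node == target and path:
                return path
            if len(path) == max_depth:
                continue
            for relation, neighbor in self.out_edges.get(node, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append((neighbor, path + [(node, relation, neighbor)]))
        return None


def verify(store, claim):
    # Assumes a three-token claim: "<subject> <relation> <object>".
    subject, relation, obj = claim.split()
    if store.has_edge(subject, relation, obj):
        return {"verified": True, "reasoning": f"direct edge {subject} -[{relation}]-> {obj}"}
    path = store.find_path(subject, obj)
    if path:
        hops = ", ".join(f"{s} -[{r}]-> {t}" for s, r, t in path)
        return {"verified": False, "reasoning": f"no '{relation}' edge; linked via {hops}"}
    return {"verified": False, "reasoning": f"no directed path from {subject} to {obj}"}


store = TripleStore()
store.add_edge("Socrates", "philosophy", "studied")
store.add_edge("Socrates", "Plato", "taught")
store.add_edge("Plato", "Aristotle", "taught")
store.add_edge("Plato", "Academy", "founded")

for claim in ["Socrates taught Plato", "Socrates founded Academy", "Aristotle taught Socrates"]:
    print(claim, "->", verify(store, claim))
```

Returning the path alongside the verdict gives the caller something to show the user when a claim is rejected.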

Neuro-Symbolic Planning

from smgp.core.graph import SpectralMemoryGraph
from smgp.reasoning.planner import NeuroSymbolicPlanner

graph = SpectralMemoryGraph(hd_dim=1000, seed=42)

graph.add_node("patient", label="entity", properties={"condition": "fever"})
graph.add_node("aspirin", label="medication")
graph.add_node("fever_reduction", label="outcome")
graph.add_edge("aspirin", "fever_reduction", "causes")
graph.add_edge("patient", "aspirin", "can_take")

planner = NeuroSymbolicPlanner(graph)

plan = planner.plan(
    query="How to treat fever?",
    max_depth=3,
    max_branching=2
)

print("Reasoning Plan:")
for step in plan["steps"]:
    print(f"  Step {step['step']}: {step['action']}")
    print(f"    Evidence: {step.get('evidence', 'N/A')}")
    print(f"    Confidence: {step.get('confidence', 'N/A')}")

print(f"\nConclusion: {plan.get('conclusion', 'N/A')}")
print(f"Confidence: {plan.get('confidence', 'N/A')}")

Long-Context Processing

import numpy as np
from smgp.core.graph import SpectralMemoryGraph
from smgp.attention.spectral_attn import SpectralAttention

# Simulate a long sequence (e.g., 1024 tokens with 128-dim embeddings)
seq_len = 1024
hidden_dim = 128
np.random.seed(42)
tokens = np.random.randn(seq_len, hidden_dim) * 0.1

# Build a graph and spectral attention mechanism
graph = SpectralMemoryGraph(hd_dim=100, seed=42)
attn = SpectralAttention(
    graph,
    hidden_dim=hidden_dim,
    num_heads=8,
    num_scales=4,
)

# Build a context graph from the token sequence
attn.build_graph_from_tokens(tokens)
print(f"Context graph: {graph.num_nodes} nodes, {graph.num_edges} edges")

# Run spectral attention (O(N log N) instead of O(N^2))
output = attn.forward(tokens)
print(f"Output shape: {output.shape}")

# Hierarchical coarsening for multi-scale processing
levels = attn.hierarchical_coarsening()
print(f"\nHierarchical coarsening:")
for i, level in enumerate(levels):
    print(f"  Level {i}: {level.num_nodes} nodes")
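
One standard way to apply a spectral filter without a full eigendecomposition is a Chebyshev polynomial approximation, which is also listed among the Spectral Engine kernels. Whether `SpectralAttention` uses it internally is an assumption; the sketch below only demonstrates the technique on a ring graph, using a heat kernel `exp(-t * lambda)` as the example filter and dense NumPy arrays for brevity.

```python
import numpy as np


def ring_normalized_laplacian(n):
    # Ring graph: every node has degree 2, so L = I - A / 2.
    adj = np.zeros((n, n))
    for i in range(n):
        adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
    return np.eye(n) - adj / 2.0


def chebyshev_coefficients(g, order, lam_max=2.0):
    # Chebyshev-Gauss quadrature on [0, lam_max] mapped to [-1, 1].
    m = order + 1
    theta = np.pi * (np.arange(m) + 0.5) / m
    lam = lam_max / 2.0 * (np.cos(theta) + 1.0)
    return np.array([2.0 / m * np.sum(g(lam) * np.cos(k * theta)) for k in range(m)])


def chebyshev_filter(lap, x, coeffs, lam_max=2.0):
    # Three-term recurrence: T_k = 2 * L_scaled @ T_{k-1} - T_{k-2}.
    lap_scaled = (2.0 / lam_max) * lap - np.eye(lap.shape[0])
    t_prev, t_curr = x, lap_scaled @ x
    out = 0.5 * coeffs[0] * t_prev + coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_prev, t_curr = t_curr, 2.0 * (lap_scaled @ t_curr) - t_prev
        out = out + c * t_curr
    return out


n, t = 32, 1.0
lap = ring_normalized_laplacian(n)
x = np.random.default_rng(0).standard_normal(n)


def heat(lam):
    return np.exp(-t * lam)


approx = chebyshev_filter(lap, x, chebyshev_coefficients(heat, order=20))

# Reference result via explicit eigendecomposition.
eigvals, eigvecs = np.linalg.eigh(lap)
exact = eigvecs @ (heat(eigvals) * (eigvecs.T @ x))

print("max abs error:", np.max(np.abs(approx - exact)))
```

With a sparse Laplacian each recurrence step is a sparse matrix-vector product, so an order-K filter costs K such products instead of an eigendecomposition.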

Architecture Overview

SMGP is a full-stack system spanning software and hardware:

+=============================================================+
|                    Application Layer                        |
|   (LLM, Knowledge Bases, Reasoning Agents, Chatbots)        |
+---------------------------+---------------------------------+
                            |
+=============================================================+
|                  Integration Layer                          |
|  HuggingFace | LangChain | FastAPI | CLI | Python HAL       |
+---------------------------+---------------------------------+
                            |
+=============================================================+
|                  Reasoning Layer                            |
|  ClaimVerifier | NeuroSymbolicPlanner (DPO rewrite search)  |
+---------------------------+---------------------------------+
                            |
+===========================+=================================+
|              Memory & Attention Layer                       |
|  MemoryStore | AssociativeMemory | MemoryLifecycle          |
|  SpectralAttention (O(N log N) multiscale)                  |
+===========================+=================================+
                            |
+===========================+=================================+
|                    Core Layer                               |
|  SpectralMemoryGraph | HyperdimensionalMemory               |
|  SpectralMethods | TopologicalAnalyzer | GraphRewriter      |
+=============================================================+
                            |
              +=============+=============+
              |  SOFTWARE (Python/CPU)    |
              |  NumPy / SciPy / GUDHI    |
              +-------------+-------------+
                            |
              +=============+=============+==============+
              |    HARDWARE (SMGPU FPGA)                 |
              |   Spectral | HD | Topo | Rewrite Engines |
              |   NoC | HBM | CAM | Memristor Crossbar   |
              +==========================================+

Software Stack

+---------------------------------------------+
|           Integration Layer                 |
|  (HuggingFace, LangChain, FastAPI, CLI)     |
+---------------------------------------------+
|              Reasoning Layer                |
|  (ClaimVerifier, NeuroSymbolicPlanner)      |
+---------------------------------------------+
|        Memory & Attention Layer             |
|  (MemoryStore, AssociativeMemory,           |
|   MemoryLifecycle, SpectralAttention)       |
+---------------------------------------------+
|               Core Layer                    |
|  (SpectralMemoryGraph, HyperdimensionalMem, |
|   SpectralMethods, TopologicalAnalyzer,     |
|   GraphRewriter)                            |
+---------------------------------------------+
|            Utils & Config                   |
|  (IO, Benchmarking, YAML/JSON config)       |
+---------------------------------------------+

Module Map

Module	Path	Description
`core.graph`	`smgp.core.graph`	SpectralMemoryGraph — the central data structure
`core.hyperdim`	`smgp.core.hyperdim`	HyperdimensionalMemory — HD vector operations
`core.spectral`	`smgp.core.spectral`	SpectralMethods — Laplacian, eigendecomposition
`core.topology`	`smgp.core.topology`	TopologicalAnalyzer — persistent homology
`core.category`	`smgp.core.category`	GraphRewriter — category-theoretic DPO
`memory.store`	`smgp.memory.store`	MemoryStore — persistent thread-safe storage
`memory.associative`	`smgp.memory.associative`	AssociativeMemory — content-addressable retrieval
`memory.lifecycle`	`smgp.memory.lifecycle`	MemoryLifecycle — forgetting and consolidation
`attention.spectral_attn`	`smgp.attention.spectral_attn`	SpectralAttention — O(N log N) attention
`reasoning.verifier`	`smgp.reasoning.verifier`	ClaimVerifier — path-based verification
`reasoning.planner`	`smgp.reasoning.planner`	NeuroSymbolicPlanner — chain-of-thought
`integration.huggingface`	`smgp.integration.huggingface`	SMGPForCausalLM — HF model wrapper
`integration.langchain`	`smgp.integration.langchain`	SMGPMemory, SMGPVerifierTool
`integration.api`	`smgp.integration.api`	FastAPI REST endpoints
`utils.io`	`smgp.utils.io`	Graph save/load utilities
`utils.bench`	`smgp.utils.bench`	Benchmarking CLI

Hardware Accelerator (SMGPU)

                             +---------------------------+
                             |       PCIe / AXI Host     |
                             +-------------+-------------+
                                           |
                             +-------------v-------------+
                             |        ISA Decoder        |
                             |   (32-bit instruction     |
                             |    fetch & dispatch)      |
                             +------+------+------+------+
                                    |      |      |
            +-----------------------+      |      +-----------------------+
            |                              |                              |
 +----------v----------+        +----------v----------+        +----------v----------+
 |   Spectral Engine   |        |      HD Engine      |        |   Topology Engine   |
 |  - Laplacian        |        |  - Bundle (Maj)     |        |  - Union-Find       |
 |  - Eigen-decomp     |        |  - Bind/Unbind(XOR) |        |  - Barcode emit     |
 |  - Chebyshev        |        |  - Permute (shift)  |        |  - Wasserstein      |
 |  - GFT / Wavelet    |        |  - Similarity       |        |  - Stability chk    |
 |  (16x16 systolic)   |        |  (16 parallel banks)|        |                     |
 +----------+----------+        +----------+----------+        +----------+----------+
            |                              |                              |
 +----------v----------+        +----------v----------+        +----------v----------+
 |   Rewrite Engine    |        |  Associative Cache  |        |      Memristor      |
 |  - DPO match        |<------>|  (CAM, 256 entries) |<------>|   Crossbar Array    |
 |  - DPO apply        |        |  - O(1) recall      |        |  - 1024x1024        |
 |  - Proof trace      |        |  - 1024-dim HD keys |        |  - Analog MVM       |
 +----------+----------+        +---------------------+        +---------------------+
            |
 +----------v----------+        +---------------------+        +---------------------+
 |      Graph DMA      |<------>| 2D Mesh NoC Router  |<------>|   HBM Controller    |
 |  (scatter-gather)   |        | (5-port, XY routing)|        |   (4-ch, 256-bit)   |
 +---------------------+        +---------------------+        +----------+----------+
                                                                          |
                                                                 +--------v--------+
                                                                 | HBM2 (external) |
                                                                 +-----------------+

Compute Engines

Engine	Architecture	Key Operations	Pipeline Depth
Spectral	16x16 systolic array	Laplacian, eigen-decomp, Chebyshev conv, GFT, wavelets	4 stages
HD	16 parallel banks	Bundle (majority), bind/unbind (XOR), permute, similarity (popcount)	1-20 cycles
Topology	Union-Find with parallel prefix	Filtration, persistent homology, barcode, Wasserstein	6 stages
Rewrite	Backtracking search FSM	DPO pattern match, rule apply, proof trace	10-state FSM
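
The Union-Find approach named for the Topology engine can be illustrated in software for the simplest case, 0-dimensional persistence (connected components). This is a sketch under assumptions, not the RTL algorithm: every node is taken to be born at filtration value 0, edges enter in order of weight, and the function name is hypothetical.

```python
import math


def zero_dim_persistence(num_nodes, weighted_edges):
    """Return (birth, death) bars for connected components of a graph filtration.

    weighted_edges is an iterable of (weight, u, v) tuples.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    bars = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # Two components merge: one of them dies at this edge weight.
            parent[root_u] = root_v
            bars.append((0.0, weight))
        # Otherwise the edge closes a cycle, which 0-dim persistence ignores.

    # Components that never merge persist forever.
    bars.extend([(0.0, math.inf)] * (num_nodes - len(bars)))
    return bars


edges = [(0.1, 0, 1), (0.2, 2, 3), (0.9, 1, 2), (0.95, 0, 3)]
bars = zero_dim_persistence(4, edges)
print(bars)

# Short bars are candidates for pruning; the threshold here is arbitrary.
threshold = 0.15
significant = [bar for bar in bars if bar[1] - bar[0] > threshold]
print(len(significant), "bars exceed the threshold")
```

Bars with short lifetimes correspond to components that merge almost immediately, which is the intuition behind persistence-based pruning.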

Memory Subsystem

Component	Specification
Associative Cache	256-entry CAM with 1024-dim projected keys; all entries are compared in parallel, giving O(1) lookup latency
HBM2 Controller	4 channels, 256-bit data width, burst-16, AXI4-Stream
Memristor Crossbar	1024x1024 conductance cells, 8-bit resolution, analog MVM
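
A software model of a CAM lookup helps connect the cache to the HD engine's XOR and popcount operations. If bipolar +1/-1 is stored as bit 0/1, element-wise multiplication becomes XOR and similarity follows from the Hamming distance. The sketch below assumes that encoding and uses random keys; the function name `cam_lookup` and the packed layout are illustrative, not the hardware format.

```python
import numpy as np

rng = np.random.default_rng(0)
num_entries, key_bits = 256, 1024

# Random binary keys, packed 8 bits per byte: 256 entries x 128 bytes.
keys_bits = rng.integers(0, 2, size=(num_entries, key_bits), dtype=np.uint8)
keys_packed = np.packbits(keys_bits, axis=1)


def cam_lookup(query_packed, table_packed):
    # XOR marks differing bits; counting them gives the Hamming distance.
    diff = np.bitwise_xor(table_packed, query_packed)
    distances = np.unpackbits(diff, axis=1).sum(axis=1)
    best = int(np.argmin(distances))
    return best, int(distances[best])


# Query: a copy of entry 42 with 50 distinct bits flipped.
noisy = keys_bits[42].copy()
flipped = rng.choice(key_bits, size=50, replace=False)
noisy[flipped] ^= 1

index, distance = cam_lookup(np.packbits(noisy), keys_packed)
similarity = 1.0 - 2.0 * distance / key_bits
print(index, distance, round(similarity, 4))
```

In software the comparison loops over all entries; a CAM performs the same comparisons concurrently, one comparator per entry.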

Interconnect

Component	Specification
NoC	2D mesh, 5-port routers, XY deterministic routing, 2 virtual channels
DMA	Scatter-gather, 256-entry descriptor queue, CSR-aware graph traversal
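
XY deterministic routing is simple to state in code: a packet travels along the X axis until its column matches the destination, then along the Y axis. The sketch below models only the hop sequence on a 2D mesh; the coordinate convention and function name are assumptions, and virtual channels are not modelled.

```python
def xy_route(src, dst):
    """Return the (x, y) router coordinates visited from src to dst, inclusive."""
    x, y = src
    path = [(x, y)]
    # Phase 1: move along X until the column matches.
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    # Phase 2: move along Y until the row matches.
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (2, 1))
print(route)
print("hops:", len(route) - 1)
```

Because every packet resolves X before Y, the route between two routers is fixed, and on a mesh this ordering avoids routing deadlock.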

Design Principles

Dataflow-oriented execution — instructions specify data streams between engines rather than scalar register operands.
Fixed-point dominance — all arithmetic uses Q8.24 (default) instead of IEEE 754 FP, halving DSP utilisation.
Heterogeneous specialisation — each engine is hand-tuned for its target workload class.
Parameterised scaling — systolic array size, HD banks, NoC mesh dimensions, and cache depth are all SystemVerilog parameters retargetable at elaboration time.
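
The Q8.24 format can be modelled with plain integers. This sketch assumes a 32-bit signed word whose 8 integer bits include the sign, saturation on overflow, and truncation after multiplication; the RTL may round or wrap differently, and the helper names are illustrative.

```python
FRAC_BITS = 24
SCALE = 1 << FRAC_BITS
Q_MIN, Q_MAX = -(1 << 31), (1 << 31) - 1


def to_q8_24(value):
    # Scale to an integer and saturate to the 32-bit signed range.
    raw = int(round(value * SCALE))
    return max(Q_MIN, min(Q_MAX, raw))


def from_q8_24(raw):
    return raw / SCALE


def q_mul(a, b):
    # The full product has 48 fractional bits; shift back down to 24.
    product = (a * b) >> FRAC_BITS
    return max(Q_MIN, min(Q_MAX, product))


a, b = to_q8_24(1.5), to_q8_24(-2.25)
print(from_q8_24(q_mul(a, b)))
print("resolution:", 1 / SCALE)
print("saturated:", from_q8_24(to_q8_24(200.0)))
```

Under these assumptions the representable range is [-128, 128) with a step of 2^-24, roughly 6e-8.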

Configuration

SMGP can be configured via YAML, JSON, environment variables, or direct Python.

YAML Configuration

Create a smgp_config.yaml:

hd_dim: 10000
seed: 42
graph:
  normalize_embeddings: true
  edge_weight_default: 1.0
spectral:
  num_eigenvalues: 64
  laplacian_type: normalized
topology:
  max_dimension: 2
  persistence_threshold: 0.1
attention:
  num_heads: 8
  num_scales: 4
  temperature: 1.0
reasoning:
  max_depth: 5
  confidence_threshold: 0.7

Load Configuration

from smgp.config import SMGPConfig

# From YAML file
config = SMGPConfig.from_yaml("smgp_config.yaml")

# From dictionary
config = SMGPConfig.from_dict({
    "hd_dim": 10000,
    "seed": 42,
    "spectral": {"num_eigenvalues": 64},
})

# From environment variables (prefix: SMGP_)
# export SMGP_HD_DIM=10000
# export SMGP_SEED=42
config = SMGPConfig.from_env()

# Direct construction
config = SMGPConfig(hd_dim=10000, seed=42)

# Use with graph
from smgp.core.graph import SpectralMemoryGraph
graph = SpectralMemoryGraph.from_config(config)

Integrations

HuggingFace Transformers

Use SMGP as a drop-in HuggingFace model:

from smgp.integration.huggingface import SMGPForCausalLM

model = SMGPForCausalLM(
    graph_hd_dim=10000,
    graph_seed=42,
)

model.add_knowledge("Socrates", "person")
model.add_knowledge("Socrates", "Plato", "taught")

output = model.generate(
    prompt="Who did Socrates teach?",
    max_length=50,
    verify_claims=True,
)
print(output)

LangChain

Integrate SMGP as LangChain memory and tools:

from smgp.integration.langchain import SMGPMemory, SMGPVerifierTool

# Use as conversation memory
memory = SMGPMemory(hd_dim=10000, seed=42)

memory.save_context(
    {"input": "Socrates was a Greek philosopher"},
    {"output": "Yes, Socrates is known as one of the founders of Western philosophy."},
)

results = memory.load_memory_variables({"query": "Greek philosophers"})
print(results)

# Use as a verification tool
verifier = SMGPVerifierTool(graph=memory.graph)
result = verifier.run("Socrates taught Plato")
print(result)

REST API

Start the API server:

pip install "smgp[integration]"
python -m smgp.integration.api

Available endpoints:

Method	Endpoint	Description
`GET`	`/health`	Health check
`POST`	`/graph/nodes`	Add a node
`POST`	`/graph/edges`	Add an edge
`GET`	`/graph/nodes/{id}`	Get a node
`POST`	`/query`	Query similar nodes
`POST`	`/verify`	Verify a claim
`POST`	`/reason`	Generate reasoning plan
`GET`	`/stats`	Graph statistics

Example usage with curl:

curl -X POST http://localhost:8000/graph/nodes \
  -H "Content-Type: application/json" \
  -d '{"id": "Socrates", "label": "person"}'

curl -X POST http://localhost:8000/verify \
  -H "Content-Type: application/json" \
  -d '{"claim": "Socrates taught Plato"}'
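
The same requests can be issued from Python using only the standard library. This is a minimal sketch that mirrors the curl calls above; it assumes the server listens on localhost:8000 as shown and that responses are JSON, which the endpoint table does not state explicitly. The helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host and port as the curl examples

def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def post_json(path: str, payload: dict, timeout: float = 5.0):
    """Send the request and decode the response, assuming a JSON body."""
    with urllib.request.urlopen(build_post(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Building the request does not need a running server; post_json() does.
verify_request = build_post("/verify", {"claim": "Socrates taught Plato"})
```

With the API server running, `post_json("/verify", {"claim": "Socrates taught Plato"})` would send the second curl request shown above.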

Hardware Acceleration

Hardware Architecture

SMGPU is a domain-specific accelerator that offloads the four core computational pillars of SMGP from the host CPU to dedicated FPGA hardware. The design centres on four heterogeneous compute engines connected through a 2D mesh Network-on-Chip and backed by a high-bandwidth memory subsystem with an optional memristor crossbar for analog in-memory computing.

Target platform: Xilinx Alveo U280 (xcu280-fsvh2892-2L-e) with 8 GB HBM2, 250 MHz core clock.

ISA Overview

All SMGPU instructions are 32-bit fixed-width:

 [31:28] opcode      — Operation class (NOP, GRAPH_CTOR, SPECTRAL, HD, TOPOLOGY, REWRITE, MEMORY, SYSTEM)
 [27:24] sub_opcode  — Sub-operation within the class
 [23:16] flags       — Modifier flags (START, DONE_IRQ, CHAINED, STREAMING, BLOCKED, PRECISION_Q)
 [15:0]  operand     — Address, immediate, or register select

There are 8 primary opcode classes with 3-7 sub-opcodes each. Apart from NOP, they cover graph construction (add/delete/query nodes and edges), spectral transforms (Laplacian, eigen-decomp, Chebyshev, GFT, wavelets), HD operations (bundle, bind, unbind, permute, similarity, associative read/write, generate), topology (filtration, persistent homology, Wasserstein, stability check, persistence pruning), graph rewriting (DPO match/apply/verify, pattern load), memory management (load/store, flush), and system control (config, start, halt, status, IRQ, reset).

See hardware/docs_hw/isa_reference.md for the complete encoding tables and examples.
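
The field layout above can be packed and unpacked with plain shifts and masks. The sketch below follows the bit positions given in this section; the numeric opcode, sub-opcode, and flag values used in the example are placeholders, since the real assignments live in isa_reference.md.

```python
from typing import NamedTuple

class Instruction(NamedTuple):
    opcode: int      # bits [31:28]
    sub_opcode: int  # bits [27:24]
    flags: int       # bits [23:16]
    operand: int     # bits [15:0]

def encode(instr: Instruction) -> int:
    """Pack the four fields into one 32-bit instruction word."""
    if not (0 <= instr.opcode < 16 and 0 <= instr.sub_opcode < 16):
        raise ValueError("opcode and sub_opcode are 4-bit fields")
    if not (0 <= instr.flags < 256 and 0 <= instr.operand < 65536):
        raise ValueError("flags is 8 bits, operand is 16 bits")
    return (
        (instr.opcode << 28)
        | (instr.sub_opcode << 24)
        | (instr.flags << 16)
        | instr.operand
    )

def decode(word: int) -> Instruction:
    """Split a 32-bit instruction word back into its fields."""
    return Instruction(
        opcode=(word >> 28) & 0xF,
        sub_opcode=(word >> 24) & 0xF,
        flags=(word >> 16) & 0xFF,
        operand=word & 0xFFFF,
    )

# Placeholder field values, chosen only to exercise every field.
word = encode(Instruction(opcode=0x2, sub_opcode=0x1, flags=0x02, operand=0x0010))
```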

Python HAL

The Python Hardware Abstraction Layer provides a software-side interface that mirrors the SMGP Python API but routes operations to the hardware backend:

from hardware.sw.smgp_hal.hw_session import SMGPU_HAL

# Connect to the FPGA via PCIe
hal = SMGPU_HAL(device="/dev/smgpu0")

# Load a graph into HBM (csr_data must be prepared by the caller beforehand)
graph_id = hal.load_graph(nodes=1024, edges=8192, adj_matrix=csr_data)

# Run spectral analysis
result = hal.execute(
    opcode=hal.OPC_SPECTRAL,
    sub_opcode=hal.SUB_COMPUTE_LAP,
    flags=hal.FLAG_DONE_IRQ,
    operand=graph_id,
)
laplacian = result.read_buffer(1024)

# Run HD similarity query
query_vec = hal.encode_hd("Socrates taught Plato")
sim_result = hal.hd_similarity(query_vec, k=5)

The HWExecutor provides a drop-in backend that can be passed to any SMGP graph constructor or explicitly to individual components. All modules fall back to pure Python when no executor is supplied.

from smgp.core.graph import SpectralMemoryGraph
from smgp.core.spectral import SpectralMethods
from smgp.core.hyperdim import HyperdimensionalMemory
from smgp.attention.spectral_attn import SpectralAttention
from smgp.config import SMGPConfig, create_executor_from_config

# ── Option A: pass executor directly ────────────────────────────────────────
from hardware.sw.smgp_hal.executor import HWExecutor
from hardware.sw.smgp_hal.hw_session import HWSession

session  = HWSession(device="/dev/smgpu0", fallback=True)
executor = HWExecutor(session=session)

graph   = SpectralMemoryGraph(hd_dim=10000, seed=42, executor=executor)
spectral = SpectralMethods(graph)          # inherits executor from graph
hd      = HyperdimensionalMemory(dim=10000, executor=executor)
attn    = SpectralAttention(graph, hidden_dim=256)  # inherits from graph

# ── Option B: load from config ───────────────────────────────────────────────
cfg = SMGPConfig.from_dict({
    "hd_dim": 10000,
    "hardware": {"enabled": True, "device": "/dev/smgpu0", "fallback": True},
})
executor = create_executor_from_config(cfg)
graph    = SpectralMemoryGraph(hd_dim=cfg.hd_dim, seed=cfg.seed, executor=executor)

All heavy operations (compute_laplacian, compute_eigen, bind, unbind, bundle, similarity, forward) delegate to the executor when one is present, and silently fall back to the pure-Python implementation if the executor raises NotImplementedError.

See hardware/docs_hw/integration_guide.md for the full HAL API, memory mapping, PCIe driver setup, and end-to-end workflows.
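
The delegate-then-fall-back behaviour described in this section can be sketched as follows. The classes here are hypothetical stand-ins written for illustration; they are not the actual HWExecutor or HyperdimensionalMemory implementations, and the bipolar list representation is an assumption.

```python
class PartialExecutor:
    """Hypothetical executor that accelerates only some kernels."""

    def bind(self, a, b):
        # Stands in for a hardware call; bipolar bind is element-wise multiply.
        return [x * y for x, y in zip(a, b)]

    def bundle(self, vectors):
        raise NotImplementedError("bundle is not offloaded in this sketch")


class HDOps:
    """Hypothetical component that prefers the executor when one is supplied."""

    def __init__(self, executor=None):
        self.executor = executor

    def _dispatch(self, name, software_impl, *args):
        if self.executor is not None:
            try:
                return getattr(self.executor, name)(*args)
            except NotImplementedError:
                pass  # fall through to the pure-Python path
        return software_impl(*args)

    def bind(self, a, b):
        return self._dispatch("bind", self._bind_sw, a, b)

    def bundle(self, vectors):
        return self._dispatch("bundle", self._bundle_sw, vectors)

    @staticmethod
    def _bind_sw(a, b):
        return [x * y for x, y in zip(a, b)]

    @staticmethod
    def _bundle_sw(vectors):
        # Sign of the element-wise sum, with ties resolved to +1.
        return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]


ops = HDOps(executor=PartialExecutor())
bound = ops.bind([1, -1], [-1, -1])                 # served by the executor
bundled = ops.bundle([[1, -1], [1, 1], [-1, -1]])   # falls back to Python
```

Catching only NotImplementedError keeps genuine hardware faults visible to the caller instead of masking them behind the software path.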

FPGA Build

cd hardware/fpga/build

# Synthesize
make synth

# Implement and generate bitstream
make impl

# Generate Alveo U280-compatible XCLBIN (requires Vivado + Vitis)
make xclbin

# Alternative: write .bit file only
make bit_script

Estimated resource utilisation on Xilinx Alveo U280:

Resource	Systolic (16x16)	HD Engine	Topology	Total (est.)
LUT	~45 K	~18 K	~12 K	~95 K
FF	~38 K	~15 K	~10 K	~78 K
BRAM	48	24	16	120
DSP	256	64	16	384
URAM	32	8	4	48

See hardware/docs_hw/architecture.md for detailed microarchitecture documentation and ASIC migration path (7nm/5nm estimates).

RTL Simulation

Lint the RTL:

cd hardware
bash sim/scripts/run_verilator.sh lint

Run a single engine testbench:

cd hardware
bash sim/scripts/run_verilator.sh tb_spectral_engine

Run the full simulation suite (lint + all 6 testbenches):

cd hardware
bash sim/scripts/run_verilator.sh all
# Results: 6/6 passed — All tests PASSED (Verilator 5.048)

Run Python golden model comparison:

cd hardware
python sim/models/gold_spectral_model.py
python sim/models/gold_hd_model.py

Performance Targets

Hardware (FPGA @ 250 MHz)

Operation	Latency	Throughput
Laplacian (4K nodes)	~82 us	49 M edges/s
Eigen-decomp (K=64)	~5.2 ms	12.3 K iterations/s
Chebyshev conv (K=4)	~0.33 ms	12.2 M nodes/s
HD bind/unbind (10K-dim)	1 cycle	250 M ops/s
HD similarity (10K-dim)	313 cycles	0.8 M queries/s
Topology barcode (4K nodes)	~1.4 ms	2.9 K graphs/s
Wasserstein distance	~0.7 ms	1.4 K comparisons/s
DPO match (pattern 4 nodes)	~32 us	31 K patterns/s
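
The cycle counts and throughput figures for the HD rows are tied together by the 250 MHz clock. The short calculation below reproduces them. The note about a 32-bit-per-cycle popcount datapath is an inference from the numbers and is not stated in the source.

```python
import math

CLOCK_HZ = 250e6  # core clock from the table heading

def ops_per_second(cycles_per_op: int) -> float:
    """Throughput of a unit that completes one operation every N cycles."""
    return CLOCK_HZ / cycles_per_op

def latency_us(cycles: int) -> float:
    """Latency in microseconds for a given cycle count."""
    return cycles / CLOCK_HZ * 1e6

bind_rate = ops_per_second(1)     # HD bind/unbind: 1 cycle
sim_rate = ops_per_second(313)    # HD similarity: 313 cycles
sim_latency = latency_us(313)     # about 1.25 microseconds

# Assumption (not stated in the source): 313 cycles is what a popcount
# datapath consuming 32 bits per cycle would need for a 10,000-bit vector.
assumed_cycles = math.ceil(10_000 / 32)
```

Both rows agree with the table: 250 M ops/s for bind/unbind and roughly 0.8 M queries/s for similarity.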

Scalability (Parameterised)

Dimension	Small (FPGA)	Medium (FPGA)	Large (ASIC)
Systolic array	8 x 8	16 x 16	32 x 32
HD banks	4	16	64
HD dimension	4,000	10,000	40,000
NoC mesh	2 x 2	4 x 4	8 x 8
HBM channels	1	4	16
Max graph nodes	1,024	4,096	65,536
Clock frequency	250 MHz	250 MHz	1.2 GHz (7nm)

Benchmarking

SMGP includes a built-in benchmarking CLI:

# Run default benchmarks
smgp bench

# Run with custom parameters
smgp bench --num-nodes 1000 --num-queries 100 --hd-dim 10000

# Run specific benchmark
smgp bench --benchmark attention --seq-length 4096

The benchmark measures:

Throughput (queries/sec) for graph construction, similarity search, and verification
Recall@K for HD similarity retrieval
Latency (p50, p95, p99) for each operation
Memory usage for graph storage
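
For readers unfamiliar with the metrics listed above, here is one common way to compute latency percentiles and Recall@K. This is a sketch using the nearest-rank percentile and recall defined as hits divided by the number of relevant items; the definitions used inside `smgp bench` are not documented in this README and may differ.

```python
import math

def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with pct% of data at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / len(relevant)

# Synthetic data for illustration only; these are not measured results.
latencies_ms = [float(i) for i in range(1, 101)]
summary = {p: percentile(latencies_ms, p) for p in (50, 95, 99)}
recall = recall_at_k(["a", "x", "b", "y"], {"a", "b", "c"}, k=3)
```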

Programmatic Benchmarking

from smgp.utils.bench import BenchmarkRunner

runner = BenchmarkRunner(
    hd_dim=10000,
    seed=42,
    num_nodes=1000,
    num_queries=100,
)

results = runner.run_all()
for name, metrics in results.items():
    print(f"{name}:")
    for k, v in metrics.items():
        print(f"  {k}: {v}")

Testing

Software Tests

pip install -e ".[dev]"
python -m pytest tests/ -v

With coverage:

python -m pytest tests/ -v --cov=smgp --cov-report=term-missing

Run specific test modules:

python -m pytest tests/test_graph.py -v
python -m pytest tests/test_spectral.py tests/test_hyperdim.py -v
python -m pytest tests/test_end_to_end.py -v

Lint:

ruff check src/ tests/

Hardware Tests

RTL-level tests (Verilator 5.048 — 6/6 PASS):

Test	Status	Description
`tb_spectral_engine`	✅	Laplacian computation, Chebyshev convolution, wavelet transform
`tb_hd_engine`	✅	Bundle, bind/unbind, permute, similarity, random generation
`tb_topology_engine`	✅	Filtration, Union-Find, barcode emission, Wasserstein
`tb_graph_rewrite`	✅	DPO pattern matching, rule application, proof trace
`tb_associative_memory`	✅	CAM read/write, HD similarity threshold, associative query
`tb_system_end_to_end`	✅	Full system pipeline with AXI-Lite config and multi-engine dispatch

cd hardware && bash sim/scripts/run_verilator.sh all

Python-level tests:

# Golden model validation
python -m pytest hardware/tests/test_hardware_integration.py -v

# HAL integration
PYTHONPATH=src:hardware/sw python -m pytest hardware/tests/test_hardware_integration.py -v

Enhanced Tests

The enhanced test suite covers 26 test modules across tests/ and tests_enhanced/:

# Run all tests (Python 3.10+, no PyTorch required)
python -m pytest tests/ tests_enhanced/ -v

# Run with full optional deps (including PyTorch, Python 3.12)
python -m pytest tests/ tests_enhanced/ -v

# Property-based testing (Hypothesis)
python -m pytest tests_enhanced/test_hd_hypothesis.py -v

# Run with coverage across both suites
python -m pytest tests/ tests_enhanced/ -v --cov=smgp --cov-report=term-missing

Environment	Passed	Skipped	Notes
Python 3.10–3.14 (no PyTorch)	230	5	Skips: torch-dependent ONNX tests
Python 3.12 + PyTorch 2.x	235	0	All tests pass

Test Suite	Tests	Description
`tests/`	112	Core graph, spectral, HD, topology, category, attention, reasoning, integration, hardware wiring
`tests_enhanced/`	123	CLI, distributed, event log, multimodal, ONNX/vLLM, streaming attention, vector DB, federation, prune policies, explainability, hypothesis property tests
Mutation Testing	CI	Automated mutation testing via `enhancements/workflows/mutation_testing.yml`
ASV Benchmarks	`benchmarks/`	Regression benchmarks tracked with Airspeed Velocity

Contributing

Contributions are welcome! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/my-feature)
Write tests for your changes
Ensure all tests pass (python -m pytest tests/ -v)
Lint your code (ruff check src/ tests/)
Commit with conventional commits
Push and open a Pull Request

Development Setup

git clone https://github.com/rotsl/smgp.git
cd smgp
pip install -e ".[dev]"
pre-commit install  # if available

Hardware Contribution Guidelines

Follow SystemVerilog 2017 (IEEE 1800-2017) coding conventions
All new RTL must include a Verilator-lintable testbench
Golden model comparisons must pass with < 1% maximum absolute error
Document all new ISA opcodes in hardware/docs_hw/isa_reference.md
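
The golden-model criterion above can be checked with a few lines of Python. The sketch below reads "< 1% maximum absolute error" as the largest absolute difference divided by the peak magnitude of the reference; that reading is an assumption, as are the function names, so confirm the intended definition against the golden model scripts before relying on it.

```python
def max_abs_error(hw: list[float], ref: list[float]) -> float:
    """Largest element-wise absolute difference between two result vectors."""
    if len(hw) != len(ref):
        raise ValueError("length mismatch")
    return max(abs(h - r) for h, r in zip(hw, ref))

def passes_golden(hw: list[float], ref: list[float], tolerance: float = 0.01) -> bool:
    """Assumed reading: max abs error relative to the peak reference magnitude."""
    peak = max(abs(r) for r in ref) or 1.0  # avoid dividing by zero
    return max_abs_error(hw, ref) / peak < tolerance

# Synthetic example: reference values with a small injected offset.
reference = [0.0, 0.5, -1.0, 2.0]
simulated = [v + 1e-3 for v in reference]
ok = passes_golden(simulated, reference)
bad = passes_golden([0.0, 0.5, -1.0, 2.1], reference)
```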

License

SMGP is licensed under the Apache License 2.0. See LICENSE for details.

Citation

If you use SMGP in your research, please cite:

@software{smgp2026,
  author    = {Rohan R},
  title     = {Spectral Memory Graph Processor (SMGP)},
  version   = {0.1.0},
  year      = {2026},
  url       = {https://github.com/rotsl/smgp},
}

See CITATION.cff for the full citation file.

References

Foundational Theory

Chung, F.R.K. (1997). Spectral Graph Theory. CBMS Regional Conference Series.
Kanerva, P. (1988). Sparse Distributed Memory. MIT Press.
Plate, T.A. (1995). "Holographic Reduced Representations." IEEE Transactions on Neural Networks, 6(3), 623-641.
Edelsbrunner, H., Letscher, D., & Zomorodian, A. (2002). "Topological Persistence and Simplification." Discrete & Computational Geometry, 28(4), 511-533.
Ehrig, H., Ehrig, K., Prange, U., & Taentzer, G. (2006). Fundamentals of Algebraic Graph Transformation. Springer.

Spectral Methods & Attention

Hammond, D.K., Vandergheynst, P., & Gribonval, R. (2011). "Wavelets on Graphs via Spectral Graph Theory." Applied and Computational Harmonic Analysis, 30(2), 129-150.
Defferrard, M., Bresson, X., & Vandergheynst, P. (2016). "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering." NeurIPS.
Lee, J., Lee, Y., Kim, J., et al. (2019). "Set Transformer: A Framework for Attention-based Set-to-Set Learning." ICML.
Vladymyrov, M. & Carreira-Perpinan, M. (2022). "Spectral Attentions for Graphs." ICLR.

Hardware Architecture

Kung, H.T. (1982). "Why Systolic Architectures?" IEEE Computer, 15(1), 37-46.
Dally, W.J. & Towles, B.P. (2004). Principles and Practices of Interconnection Networks. Morgan Kaufmann.
Ielmini, D. & Wong, H.-S.P. (2018). "In-Memory Computing with Resistive Switching Devices." Nature Electronics, 1(6), 333-343.
Hennessy, J.L. & Patterson, D.A. (2019). Computer Architecture: A Quantitative Approach. 6th Ed.

For full mathematical foundations, see RESEARCH.md. For hardware architecture details, see hardware/docs_hw/architecture.md. For the complete ISA reference, see hardware/docs_hw/isa_reference.md.
