Manzela/README.md

Daniel Manzela

Founding AI Product Builder

Co-founder, CEO and CPO at TNG Shopper.


LinkedIn · Google Cloud Skills · Pipeline Observatory


What I Build

I design and operate end-to-end autonomous AI systems — from zero-to-one architecture through production. My work sits at the intersection of multi-agent orchestration, fail-closed safety, and LLM evaluation. The systems I build run with zero human oversight at enterprise scale.

Current system in production (TNG Shopper, 2024 → present):

  • 11 enterprise clients across 5 countries (ES · PT · IL · US · MX) — ~10.5M product pages under autonomous management at $0.0006 / page
  • 7-node multi-agent directed acyclic graph with ~73.5M agent operations per full run (weekly executions + daily price-delta runs) · 234 managed websites
  • Gemma 4 26B-A4B Mixture-of-Experts (MoE) on self-hosted vLLM with PagedAttention inference. Multi-LoRA (Low-Rank Adaptation) research is documented in the forensic runbook.
  • Originality, Relevance, Accuracy, Value (O-R-A-V): four-axis multi-model evaluation with a fail-closed policy and a 68.9% pass rate by design · Deterministic Evaluation and Monitoring Audit System (DEMAS) enforcing every boundary
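As a rough illustration of what "fail-closed" means for a four-axis check, here is a minimal sketch. It is not the production evaluator: the axis keys, the `Verdict` type, the `evaluate_fail_closed` function, and the 0.7 threshold are all hypothetical, and each score is assumed to have already been aggregated across judge models. The point is only that anything missing, malformed, or below threshold is rejected, so an item passes solely on positive evidence.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical axis keys mirroring Originality, Relevance, Accuracy, Value.
AXES = ("originality", "relevance", "accuracy", "value")


@dataclass(frozen=True)
class Verdict:
    passed: bool
    reason: str


def evaluate_fail_closed(
    scores: Mapping[str, Optional[float]],
    threshold: float = 0.7,  # assumed value, for illustration only
) -> Verdict:
    """Pass only when every axis has a valid score at or above the threshold."""
    for axis in AXES:
        score = scores.get(axis)
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        # Missing, non-numeric, NaN, or out-of-range scores are failures, not skips.
        if not is_number or not 0.0 <= score <= 1.0:
            return Verdict(False, f"{axis}: no valid score")
        if score < threshold:
            return Verdict(False, f"{axis}: {score:.2f} below {threshold:.2f}")
    return Verdict(True, "all axes passed")
```

Under this shape a judge timeout that leaves one axis unscored produces a rejection rather than a silent pass, which is one way a pass rate well below 100% can be intentional.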

Ten years to get here. Six projects. The pattern: removing human dependencies. See the profile time-spine →.


System Architecture

```mermaid
graph LR
  subgraph Inference["Inference Layer"]
    MoE["Gemma 4 26B MoE<br/>vLLM · PagedAttention"]
    L1["LoRA α"]
    L2["LoRA β"]
    L3["LoRA γ"]
    MoE --> L1 & L2 & L3
  end

  subgraph DAG["7-Node Autonomous Pipeline"]
    N1["City DNA<br/><sub>Context</sub>"] --> N2["Normalizer<br/><sub>4 sub-agents</sub>"]
    N2 --> N3["Synonyms<br/><sub>Expand</sub>"]
    N3 --> N4["SV Gate<br/><sub>Filter</sub>"]
    N4 --> N5["Writer<br/><sub>Generate</sub>"]
    N5 --> N6["Validator<br/><sub>O-R-A-V</sub>"]
    N6 --> N7["Features<br/><sub>Vectorize</sub>"]
  end

  subgraph Eval["Evaluation & Safety"]
    ORAV["O-R-A-V Judge<br/><sub>Multi-Model Scoring</sub>"]
    DEMAS["DEMAS Audit<br/><sub>JIT · Fail-Closed</sub>"]
  end

  L1 & L2 & L3 --> N1
  N6 --> ORAV
  DEMAS -.->|"intercept at<br/>every boundary"| N1 & N2 & N3 & N4 & N5 & N6 & N7
  ORAV -.->|"RL feedback<br/>prompt mutation"| N5

  style MoE fill:#1a1a2e,stroke:#0A84FF,color:#fff
  style ORAV fill:#1a1a2e,stroke:#30D158,color:#fff
  style DEMAS fill:#1a1a2e,stroke:#FFD60A,color:#fff
```
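The "intercept at every boundary" edge in the diagram can be pictured as an audit hook that runs before each node and once more on the final output. The sketch below is an assumption-laden illustration, not the DEMAS implementation: `run_pipeline`, `check_boundary`, `BoundaryViolation`, and the two stub nodes are hypothetical names, and the real nodes are multi-agent stages rather than lambdas.

```python
from typing import Any, Callable, Dict, List, Tuple

Payload = Dict[str, Any]
Node = Tuple[str, Callable[[Payload], Payload]]
Audit = Callable[[str, Payload], bool]


class BoundaryViolation(Exception):
    """Raised when the audit hook rejects, or fails to approve, a payload."""


def check_boundary(audit: Audit, boundary: str, payload: Payload) -> None:
    try:
        approved = audit(boundary, payload) is True
    except Exception:
        approved = False  # an audit crash counts as a rejection (fail-closed)
    if not approved:
        raise BoundaryViolation(f"audit rejected payload at boundary: {boundary}")


def run_pipeline(nodes: List[Node], payload: Payload, audit: Audit) -> Payload:
    """Run nodes in order, auditing the payload at every node boundary."""
    for name, step in nodes:
        check_boundary(audit, f"before {name}", payload)
        payload = step(payload)
    check_boundary(audit, "final output", payload)
    return payload


# Hypothetical two-node slice of the pipeline, with stub logic.
nodes: List[Node] = [
    ("Normalizer", lambda p: {**p, "title": p["title"].strip()}),
    ("Writer", lambda p: {**p, "copy": f"Buy {p['title']}"}),
]
has_title: Audit = lambda boundary, p: bool(p.get("title", "").strip())
result = run_pipeline(nodes, {"title": " milk "}, has_title)
```

Raising on the first rejected boundary means no downstream node ever sees a payload the audit did not explicitly approve.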
Node Anatomy — Each node contains multiple sub-agents

Every directed-acyclic-graph node is a bounded ecosystem, not a single LLM call:

| Layer | Role | Example |
| --- | --- | --- |
| Deterministic Gate | Schema validation, type coercion, regex | Pydantic, Python AST |
| Probabilistic Agent | Semantic extraction, classification | Gemini Vision, SLM |
| Autonomy Layer | Originality-Relevance-Accuracy-Value scoring, confidence thresholds | Multi-model consensus |
| Memory | Long-term state, prompt cache mutation | Redis LTM, Firestore |

The deterministic gate always fires first. The LLM is invoked only if the gate passes.
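A minimal sketch of the gate-first ordering, under stated assumptions: the record fields (`sku`, `title`, `price`), the `SKU_PATTERN` regex, and the `call_llm` callable are hypothetical, and plain standard-library checks stand in for Pydantic so the example stays self-contained.

```python
import math
import re
from typing import Callable, Optional

SKU_PATTERN = re.compile(r"[A-Z0-9-]{4,20}")  # hypothetical SKU format


def deterministic_gate(record: dict) -> Optional[dict]:
    """Validate and coerce a record; return None if any check fails."""
    sku, title = record.get("sku"), record.get("title")
    if not isinstance(sku, str) or not SKU_PATTERN.fullmatch(sku.strip()):
        return None
    if not isinstance(title, str) or not title.strip():
        return None
    try:
        price = float(record.get("price"))  # type coercion, e.g. "12.50" -> 12.5
    except (TypeError, ValueError):
        return None
    if not (math.isfinite(price) and price > 0):
        return None
    return {"sku": sku.strip(), "title": title.strip(), "price": price}


def process(record: dict, call_llm: Callable[[dict], str]) -> Optional[str]:
    """Invoke the model only on records that passed the deterministic gate."""
    cleaned = deterministic_gate(record)
    if cleaned is None:
        return None  # gate failed: the model is never called
    return call_llm(cleaned)
```

Because the gate is cheap and deterministic, malformed input is rejected before any token is spent, and the model only ever sees records in a known shape.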


Technical Focus

AI & ML

Python · Gemma · Gemini · vLLM · LangChain

Infrastructure & MLOps

GCP · Vertex AI · Docker · Redis · BigQuery · Firestore

Evaluation & Safety

Multi-Agent · vLLM · Fail-Closed · RLHF


Featured Work

The Arc — six case studies, in chronological order

| Year | Project | What it proved |
| --- | --- | --- |
| 2016 | Asset (Sept 2016 – 2019) | Three years of solo contractor work for new-stage startups: web setups, ERP→web ETL by hand, spreadsheet automation, business plans. The data-transformation reps every later pipeline compounded on. |
| 2019 | Data Mining (Feb 2019 – Jul 2020) | Five-stage manually-orchestrated pipeline for an Israeli financial-services firm. ₪50M+ in new Assets Under Management. A pipeline is a series of filters, not a series of steps. |
| 2020 | Seller App (Jan 2020 – Apr 2024) | Computer vision for retail digitization. 3 computer-vision generations · 60M+ canonical Stock-Keeping Units · $10K Monthly Recurring Revenue plateau. The architectural origin of retrieval-grounded computer vision. |
| 2020 | Tasko AI (Oct 2020 – Dec 2023) | Production agentic system before the term existed. 21,102 labeled tasks · 153 clients · 1,561 intent patterns · 4-layer Classify / Retrieve / Execute / Verify. |
| 2024 | Elysium (2024 – 2025, paused-pending-Pipeline) | Physical-Context AI for Retail. 13 brands validated · 15,600+ store locations · 15 patent claims (3 independent + 12 dependent). |
| 2024 | Pipeline Observatory (2024 – present) | The synthesis. Seven-node directed acyclic graph, deterministic gates first, fail-closed by default. 10.5M product detail pages / month · 73.5M ops / month · $0.0006 / page · 68.9% pass rate across Originality, Relevance, Accuracy, Value. |
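The Pipeline Observatory figures appear to fit together arithmetically. The check below assumes one operation per page per node, which is an inference from the numbers rather than something stated here, and the implied spend for one pass over every page is likewise derived, not quoted. Under those assumptions, 10.5M pages across 7 nodes gives 73.5M operations, and 10.5M pages at $0.0006 each comes to $6,300.

```python
from decimal import Decimal

PAGES = 10_500_000                 # product detail pages, from the table above
NODES = 7                          # nodes in the directed acyclic graph
COST_PER_PAGE = Decimal("0.0006")  # USD per page, from the table above

# Assumption: every page passes through every node exactly once.
ops_per_pass = PAGES * NODES

# Implied cost of one pass over all pages (derived, not a quoted figure).
cost_per_pass = PAGES * COST_PER_PAGE

print(f"{ops_per_pass:,} operations, ${cost_per_pass:,.2f}")
```

`Decimal` is used for the price so the product stays exact instead of picking up binary floating-point noise.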

Open-source distillations (parallel track)

| Repository | Description |
| --- | --- |
| agent-dag-pipeline | Open-source distillation of the seven-node directed acyclic graph. Google Agent Development Kit + Vertex AI + four-axis Originality-Relevance-Accuracy-Value evaluation + Direct Preference Optimization data flywheel. |
| Antigravity-OS · pip install ag-os | Orchestration infra that lets a small team ship more product per engineer, safely. Built at TNG (2-engineer dev team) to streamline the SDLC by reducing human-dependencies in the dev loop: AI dev assistants run inside a 9-rule constitutional policy-as-code, with Cost Guard, Flight Recorder for replay, Self-Healing CI, Dreaming Module for offline agent self-improvement, drop-in MCP server. |
| gemma4-vllm-deployment | Forensic runbook documenting 20 failure modes across 30+ deployment versions of Gemma 4 Mixture-of-Experts on Vertex AI with vLLM. The community reference for production Mixture-of-Experts serving. |
| pipeline-observatory | Source of the live observability site at manzela.github.io/pipeline-observatory. Architecture visualization: Mixture-of-Experts sparse routing, causal directed-acyclic-graph tracing, live execution telemetry. |
| WP-Multisite | A WordPress multisite for retailer sites, written from scratch to be natively understood by AI search bots (ChatGPT, Claude, Perplexity, Google AI Overview). Explicit AI-bot ALLOW contracts (OAI-SearchBot, ChatGPT-User, PerplexityBot, ClaudeBot, Applebot, Amazonbot, DuckAssistBot), llms.txt indexing surface, JSON-LD with SpeakableSpecification for voice. Sanitised from the TNG production stack (Sage 10 + custom sunrise.php + 1,056-LOC schema generator). |
