Intercept and inspect Coding Agent API traffic from Claude Code, Codex CLI, Gemini CLI, Cursor CLI, OpenCode, Kimi, Pi, and Hermes in a local trace viewer.
The open-source MultiAgentOps evaluation and verification harness for business workflows in any industry.
🔍 AI observability skill for Claude Code. Debug LangChain/LangGraph agents by fetching execution traces from LangSmith Studio directly in your terminal.
Local open-source dev tool to debug, secure, and evaluate LLM agents. Provides static analysis, dynamic security checks, and runtime monitoring - integrates with Cursor and Claude Code.
Cut your OpenClaw / ZeroClaw token bill. Find which model earns its cost. Prove whether optimizations actually work. Local, no upload.
Local replay debugger for Browser Use failures with screenshots, model I/O, failed-step timelines, and public-safe HTML exports.
Diagnose your AI agents in production. Extract policies from prompts, evaluate traces, generate diagnostic reports.
Visual debugging, tracing, and replay for agent workflows.
Explain why your agent failed — root-cause debugging, memory attribution, and run divergence for LLM agents.
🔍 A beautiful web viewer for AI agent session files. Browse Claude Code & OpenClaw conversations with chat-style UI, timeline visualization, and zero setup.
A real-time observability and debugging layer for AI agents.
ChainWatch is a flight data recorder for multi-step AI systems. It's a CLI-based tool that records every step in an AI decision chain, links them together in order, prevents tampering, and allows you to verify the chain's integrity and replay the full decision flow.
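The "links them together in order, prevents tampering" behavior described above is the classic tamper-evident hash chain. A minimal sketch of that technique (this is an illustrative reimplementation, not ChainWatch's actual code; the function names are hypothetical):

```python
import hashlib
import json


def append_step(chain, payload):
    """Append a step whose hash covers the previous step's hash,
    so editing any earlier step breaks every later link."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": hashlib.sha256(body.encode()).hexdigest(),
    })
    return chain


def verify(chain):
    """Recompute every link in order; False means the chain was altered."""
    prev_hash = "0" * 64
    for step in chain:
        body = json.dumps({"payload": step["payload"], "prev": prev_hash},
                          sort_keys=True)
        if step["prev"] != prev_hash:
            return False
        if step["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev_hash = step["hash"]
    return True
```

Replaying the decision flow is then just iterating the verified chain in order; integrity checking and replay fall out of the same structure.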
Local recorder and replay verifier for AI-agent command runs.
Failure attribution for agent pipelines — find which span caused the failure and what kind of fix it needs.
RunLens helps teams compare and debug AI agent runs with step timelines, run diffs, and cost analysis.
Android Agent Reliability Runtime: a debugging and safety runtime for mobile GUI agents. Detect readiness, block unsafe actions, verify progress, diagnose failures, and save reproducible traces.
Enforce communication discipline & execution hygiene for agent teams. Detect loops, route violations, stale work, and missing ownership.
Verify that AI agents actually executed API/tool calls they claim.
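One simple form of that check is a multiset diff between the calls an agent reports and the calls an instrumented client actually recorded. A hedged sketch of the idea (the function name and trace format here are hypothetical, not this project's API):

```python
from collections import Counter


def unverified_claims(claimed, executed):
    """Return tool/API calls the agent claims to have made but that
    never appear in the recorded execution trace (multiset difference,
    so repeated calls must be backed by repeated trace entries)."""
    return list((Counter(claimed) - Counter(executed)).elements())
```

Anything returned is a claim with no matching trace entry and warrants investigation.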
LangGraph pipeline that diagnoses failed agent runs: classifies failures, grounds them in source code, pauses at a human approval gate, generates regression tests, writes the customer memo, and learns reusable patterns for future diagnoses.