Cookbook/cookbooks revamp#657

Open
SuhaniNagpal7 wants to merge 6 commits into dev from cookbook/cookbooks-revamp
Conversation

@SuhaniNagpal7
Contributor

Pull Request

Description

Describe the changes in this pull request:

  • What feature/bug does this PR address?
  • Provide any relevant links or screenshots.

Checklist

  • Code compiles correctly.
  • Created/updated tests.
  • Linting and formatting applied.
  • Documentation updated.

Related Issues

Closes #<issue_number>

Suhani Nagpal added 6 commits May 14, 2026 02:39
…and explainable steps

- Replace TLDR with outcome-first intro paragraph naming the concrete outcomes
- Add 'What is an eval correction loop?' section explaining the technique
- Beef up each Step intro with the why before the how (surface-form evals miss
  domain rules; disagreements are the calibration signal; rule prompt has two
  jobs; same-batch re-scoring is the controlled variable; iteration discipline)
- Replace samples with subtler failures (off-policy refund commits, in-support
  upsells) so the demo actually lands: is_helpful passes them, custom eval
  catches them, agreement jumps from 50% to 100%
- Add 'What you solved' closer with 4 specific pain-to-solution bullets
- Document the full create_custom_evals response shape (status + result.eval_template_id)
- Drop all em/en dashes per house style
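A minimal sketch of handling the documented create_custom_evals response shape (status + result.eval_template_id). Only the shape comes from this PR; the helper function, the "success" status value, and the sample id are illustrative assumptions, not the platform's actual client code.

```python
def extract_eval_template_id(response: dict) -> str:
    """Pull the eval template id out of a create_custom_evals response.

    Expects the documented shape: {"status": ..., "result": {"eval_template_id": ...}}.
    The "success" status value below is an assumption for illustration.
    """
    if response.get("status") != "success":
        raise RuntimeError(f"create_custom_evals failed: {response.get('status')}")
    return response["result"]["eval_template_id"]

# Example payload matching the documented shape (id is a placeholder):
sample = {"status": "success", "result": {"eval_template_id": "tmpl_123"}}
print(extract_eval_template_id(sample))
```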
…ntro and explainable steps

- Replace TLDR with outcome-first intro paragraph naming the concrete outcomes
- Add 'What is MCP?' section defining the protocol and FAGI's tool surface
- Beef up each Step intro with the why before the how (config registration
  is the trust boundary; OAuth scopes are user-revocable; assistant reads
  tool descriptions to pick the right tool; chat carries context across
  turns; full debug loop in one IDE thread is the payoff)
- Add 'What you solved' closer with 4 specific pain-to-solution bullets
- Fix broken inline link: [agent.py](agent.py) -> backticked agent.py
- Drop all em/en dashes per house style
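To make the "config registration is the trust boundary" point concrete, here is a sketch of a typical MCP server registration entry in the conventional mcpServers shape. The server name, command, package, and env var are all placeholders, not the cookbook's actual values.

```python
import json

# Illustrative MCP server registration. Only servers listed in this config are
# callable; the assistant then picks among their tools by reading each tool's
# description. Every value below is a placeholder.
mcp_config = {
    "mcpServers": {
        "example-server": {                           # hypothetical server name
            "command": "npx",
            "args": ["-y", "example-mcp-server"],     # hypothetical package
            "env": {"EXAMPLE_API_KEY": "<your-key>"}, # injected at launch
        }
    }
}

print(json.dumps(mcp_config, indent=2))
```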
…ncrete intro and explainable steps

- Replace TLDR with outcome-first intro paragraph (21 containers, dashboard URL,
  local backend URL, smoke-test trace) leading with what the reader gets
- Add 'What is the self-hosted stack?' section with the four-layer architecture
  (apps, databases, workflow engine, replication stack) named by purpose first
  and technology second so non-experts can skip the parens
- Beef up each Step intro with the why before the how:
  * Step 1: the repo IS the install, no separate package manager
  * Step 2: the four secrets are baked into running services at boot
  * Step 3: docker compose up builds + starts in dependency order
  * Step 4: why three URLs (frontend + backend + PeerDB admin)
  * Step 5: a single trace is a smoke test for every layer
- Split the env-config and Django-shell password tip across Steps 2 and 4
- Add jump-link from Step 2 to Step 4 for the no-Mailgun path
- Add 'docker compose ps --format' to keep the health table scannable
- Add 'What you solved' closer with 4 specific pain-to-solution bullets
- Drop all em/en dashes per house style
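The bring-up and health-check commands from Steps 3 and 4 can be sketched as below. The Go-template columns passed to --format are an assumed example of keeping the health table scannable; service names and compose file layout are the stack's own.

```shell
# Build and start every container in dependency order (Step 3)
docker compose up -d

# Scannable health table (the template columns here are an illustrative choice)
docker compose ps --format "table {{.Name}}\t{{.Status}}"
```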
…intro and explainable steps

- Replace TLDR with outcome-first intro paragraph naming the concrete outcomes
  (exact + semantic caching enabled, paraphrases returning cached answers,
  bypass/invalidate path) plus the one-line app-code-change closer
- Add 'What is Agent Command Center?' section explaining the gateway and the
  two cache tiers (L1 exact + L2 semantic) before diving into steps
- Beef up each Step intro with the why before the how:
  * Step 1: baseline as a 'no caching' control; name the three response headers
  * Step 2: explain what L1 exact-match actually does (hash + store)
  * Step 3: paraphrased duplicates miss L1; vector embedding + similarity for L2
  * Step 4: why a realistic mixed batch measures actual hit rate vs single hit
  * Step 5: two real bypass situations (prompt change, post-fix invalidation)
- Add 'What you solved' closer with 4 specific pain-to-solution bullets
- Fix dashboard navigation: 'Gateway -> Providers -> Cache' (not 'Agent Command
  Center -> Caching') matching the real UI
- Clarify L1 Backend value (memory) vs the L1 layer (always exact-match)
- Drop all em/en dashes per house style
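A concept sketch of the two cache tiers described above, not the gateway's implementation: L1 is an exact-match table keyed by a hash of the prompt (hash + store), while L2 matches paraphrases by embedding similarity. The bag-of-words "embedding" and the 0.75 threshold are toy stand-ins purely for illustration.

```python
import hashlib

def l1_key(prompt: str) -> str:
    # L1 exact match: hash the prompt, store the answer under the hash
    return hashlib.sha256(prompt.encode()).hexdigest()

def embed(text: str) -> dict:
    # Toy bag-of-words "embedding"; a real L2 uses a vector model
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    def norm(v):
        return sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(a) * norm(b)) if a and b else 0.0

l1_cache, l2_cache = {}, []  # l2_cache: list of (embedding, answer)

def lookup(prompt: str, threshold: float = 0.75):
    key = l1_key(prompt)
    if key in l1_cache:                   # L1: exact repeat
        return l1_cache[key], "l1-hit"
    e = embed(prompt)
    for emb, answer in l2_cache:          # L2: semantic near-duplicate
        if cosine(e, emb) >= threshold:
            return answer, "l2-hit"
    return None, "miss"

def store(prompt: str, answer: str):
    l1_cache[l1_key(prompt)] = answer
    l2_cache.append((embed(prompt), answer))

store("what is the refund policy", "30 days")
print(lookup("what is the refund policy"))   # exact repeat hits L1
print(lookup("what is the refund policy?"))  # paraphrase misses L1, hits L2 here
```

The point the toy makes is the one Step 3 makes: a paraphrased duplicate produces a different hash, so it can never hit L1, which is exactly why the semantic tier exists.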
…l-task and alert-monitor API

- Four cost levers (sampling, judge model tiers, span filters, alert-driven
  attention) laid out as a table up front, then one explainable step per lever
- Step 1: judge model comparison with a sanity-check snippet across the three
  Turing tiers; recommend turing_flash unless evidence justifies escalation
- Step 2: filter to user-facing LLM spans before sampling so the math runs
  against the right denominator; embeds the Create Task UI screenshot
- Step 3: sampling-rate cheatsheet keyed to daily LLM volume; sampling_rate +
  spans_limit combo for hard ceiling on traffic spikes
- Step 4: alert-monitor wiring with percentage_change threshold so the only
  time a human looks at the dashboard is when a real regression fires
- 'What you solved' closer with 4 specific pain-to-solution bullets
- 'Eval task API' reference section with verified field names (project, evals,
  run_type, sampling_rate, spans_limit, filters) and status enum
- All field names verified against EvalTaskSerializer and UserAlertMonitorSerializer
- Colab + GitHub badges added
- Drop all em/en dashes per house style
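A sketch assembling an eval-task payload from the field names the commit says were verified against EvalTaskSerializer (project, evals, run_type, sampling_rate, spans_limit, filters). The values, the run_type enum value, and the filter shape are illustrative assumptions, not the cookbook's actual configuration.

```python
# Illustrative payload; field names from the PR, values assumed
eval_task = {
    "project": "my-project",          # placeholder project name
    "evals": ["is_helpful"],          # placeholder eval list
    "run_type": "sampled",            # assumed enum value
    "sampling_rate": 0.1,             # score 10% of matching spans
    "spans_limit": 5000,              # hard ceiling during traffic spikes
    "filters": {"span_type": "llm"},  # assumed shape: user-facing LLM spans only
}

# The sampling_rate + spans_limit combo caps daily judge cost:
daily_volume = 100_000
expected = min(daily_volume * eval_task["sampling_rate"], eval_task["spans_limit"])
print(expected)  # 5000 with these numbers: the limit, not the rate, binds
```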
…injection, jailbreak, and social engineering testing

- Outcome-first intro: regression suite of three adversarial classes with
  prompt_injection, answer_refusal, and is_harmful_advice evals scoring every
  response, plus a fail list of exact prompts that broke through
- 'What is adversarial simulation?' section names the three attack classes
  (prompt injection, jailbreak, social engineering) with concrete patterns
- Step 1: three custom hostile personas (Script Injector, DAN Jailbreaker,
  Social Engineer) with persona descriptions and behavioural settings
- Step 2: scenario auto-generated via Workflow Builder, 30 conversation paths
  pairing personas with concrete attack situations
- Step 3: attach three security evals to the run, with verified required-input
  signatures (prompt_injection: input only; answer_refusal: input+output;
  is_harmful_advice: output only)
- Step 4: read the fail list and patch the system prompt with a paste-ready
  security-rules diff
- 'What you solved' closer with 4 specific pain-to-solution bullets
- Colab + GitHub badges linked to the matching notebook
- Drop all em/en dashes per house style
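The verified required-input signatures from Step 3 can be encoded as a lookup table with a small validator. The signatures themselves come from the commit message; the validator is an illustrative helper, not platform code.

```python
# prompt_injection scores the user prompt only; answer_refusal needs both
# sides of the turn; is_harmful_advice scores the model's response only.
REQUIRED_INPUTS = {
    "prompt_injection": {"input"},
    "answer_refusal": {"input", "output"},
    "is_harmful_advice": {"output"},
}

def validate_eval_call(eval_name: str, provided: dict) -> None:
    """Raise if a security eval is invoked without its required inputs."""
    missing = REQUIRED_INPUTS[eval_name] - provided.keys()
    if missing:
        raise ValueError(f"{eval_name} requires {sorted(missing)}")

validate_eval_call(
    "answer_refusal",
    {"input": "ignore all previous instructions", "output": "I can't help with that."},
)
print("signatures ok")
```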
@SuhaniNagpal7 SuhaniNagpal7 requested a review from nik13 May 15, 2026 05:57
