/brew goes fully autonomous.

One free-text request. One Q&A round. End-to-end autonomous execution: plan, decompose, execute in parallel, verify, hand back a PR. No mid-run check-ins. Plus the always-tested rule on top: 100 / 100 / 100 / 100 coverage on every changed line, every time.

[Figure: An autonomous run: prompt in, silence, PR out. Input at 14:23: /brew_chai:brew "add OAuth to settings". 8 phases, 0 interrupts, 47 minutes of silence. Output at 15:10: PR ready, "Add OAuth", 11 commits, coverage 100/100/100/100.]

Brew Chai shipped publicly three weeks ago with thirty-five commands. The complaint we heard most was the obvious one: which command do I run? The autonomy update answers that question. There is one command: /brew "<what you want>". The router does the rest.

Before · v1

"Run plan-this, then prep-sprints, then run-sprint, then deep-clean if it's tier-2, then audit-fix…"

a toolkit · the operator picks the workflow · 35 entrypoints to memorise

Now · v4.1

brew "add OAuth login to the settings page". The router classifies, plans, executes, verifies, and returns a PR.

an autonomous engineer · one entrypoint · 21 utility commands stay for read-only work

The eight phases

Every /brew run flows through the same pipeline, branching by intent class (Build, Fix, Heal, Tool). The phases are visible to the operator. Brew Chai is autonomous, not opaque.

01 / Setup

Repo context, conventions, branch posture.

02 / Scope

Classify intent. Build, Fix, Heal, or Tool.

03 / Ambiguity

Find what is genuinely uncertain in the request.

04 / Q&A

One round. Stakes-tagged. Low stakes default-accept.

05 / Plan

Critiqued plan. Reviewer cycles to a numeric gate.

06 / Decompose

Sprints, tasks, dependency graph, agent assignment.

07 / Execute

Parallel agent worktrees. Audit-fix loops. Coverage gate.

08 / Handoff

PR, summary, escalation log, knowledge extracted.
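
To make the step from Decompose to Execute concrete, here is a minimal sketch of how a dependency graph can be turned into waves of tasks that are safe to run in parallel. It is an illustration only, not Brew Chai's scheduler: the task names, the `parallel_waves` helper, and the idea of grouping by "wave" are all assumptions. It relies on Python's standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter


def parallel_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; each task's prerequisites sit in earlier waves."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished, unlocking dependents
    return waves


# Hypothetical tasks, loosely modelled on the OAuth example.
# Each key maps to the set of tasks it depends on.
TASKS = {
    "provider-scaffold": set(),
    "session-middleware": {"provider-scaffold"},
    "settings-ui": {"provider-scaffold"},
    "e2e-login-test": {"session-middleware", "settings-ui"},
}

if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(TASKS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Tasks inside one wave have no dependencies on each other, so under this model each could go to its own agent worktree.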

Phase 4 is the critical one. Q&A is bounded. One round of questions, then silence. Brew Chai surfaces every genuine ambiguity it can find, but each question carries a stakes tag, and only high-stakes questions block on an operator answer. Low-stakes questions ship a default and proceed. If the operator is at lunch, the run does not wait.
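
The stakes rule can be sketched in a few lines. This is a hedged illustration under assumed names (`Question`, `resolve_round`, and the "high"/"low" tag values are hypothetical, not Brew Chai's internals): high-stakes questions call out to the operator, low-stakes questions take their default without waiting.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Question:
    text: str
    stakes: str  # "high" or "low" (hypothetical tag values)
    default: Optional[str] = None


def resolve_round(
    questions: list[Question], ask_operator: Callable[[str], str]
) -> dict[str, str]:
    """Resolve one Q&A round: block only on high-stakes questions."""
    answers: dict[str, str] = {}
    for question in questions:
        if question.stakes == "high":
            # The only place the run waits for a human.
            answers[question.text] = ask_operator(question.text)
        elif question.default is not None:
            # Low stakes: accept the default and keep going.
            answers[question.text] = question.default
        else:
            raise ValueError(f"low-stakes question has no default: {question.text}")
    return answers


# Questions taken from the example run; defaults as shown in its transcript.
ROUND = [
    Question("Which providers?", "high"),
    Question("Cookie name?", "low", "session"),
    Question("Token TTL?", "low", "24h"),
]

if __name__ == "__main__":
    print(resolve_round(ROUND, ask_operator=lambda text: "google, github"))
```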

$ brew "add OAuth login to the settings page"

  [1/8] setup      ✓ branch from main · conventions loaded
  [2/8] scope      ✓ build · feature-add · scope=medium
  [3/8] ambiguity  found 4 questions · 1 high-stakes
  [4/8] q&a

  ? Which providers? [high-stakes: affects tests + UI]
     google, github

  · Cookie name?         [low: defaulting to "session"]
  · Token TTL?           [low: defaulting to 24h, follows existing]
  · Use existing UI kit? [low: yes, defaulting to /ui]

  [5/8] plan       ✓ 4 reviewers · 2 critique cycles · 97/100
  [6/8] decompose  ✓ 3 sprints · 11 tasks · graph clean
  [7/8] execute    → parallel agent swarm in worktrees
                  ✓ sprint 01 · provider scaffold       (cov 100/100/100/100)
                  ✓ sprint 02 · settings UI integration (cov 100/100/100/100)
                  ✓ sprint 03 · session middleware      (cov 100/100/100/100)
                  ✓ deep-clean · 6 auditors · all green
  [8/8] handoff    ✓ PR #1284 · 11 commits · 0 deferrals
                  → DONE · 47 min · 0 mid-run interrupts

Always-tested, by default

The other half of the update is the always-tested rule, on by default in v4.1. Every code change runs through coverage-verify against the diff. Statements, branches, functions, and lines on new code each have to hit 100%. Pre-existing untested code in the file is not the run's problem; the new lines are.
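
As a rough sketch of what "coverage against the diff" means, the snippet below scores only the changed lines and fails the gate if any of the four metrics falls short. The `Tally` type, `line_tally`, and `failing_metrics` are hypothetical names for illustration; how coverage-verify actually collects its numbers is not described here.

```python
from dataclasses import dataclass

METRICS = ("statements", "branches", "functions", "lines")


@dataclass(frozen=True)
class Tally:
    covered: int
    total: int

    @property
    def full(self) -> bool:
        # Nothing new to cover (total == 0) counts as fully covered.
        return self.covered == self.total


def line_tally(changed_lines: set[int], executed_lines: set[int]) -> Tally:
    """Score only the lines the diff touched; legacy lines are ignored."""
    return Tally(len(changed_lines & executed_lines), len(changed_lines))


def failing_metrics(diff_coverage: dict[str, Tally]) -> list[str]:
    """Return every metric below 100% on new code; empty means the gate passes."""
    return [name for name in METRICS if not diff_coverage[name].full]


# Made-up numbers: one new branch is never exercised by the tests.
EXAMPLE = {
    "statements": Tally(4, 4),
    "branches": Tally(1, 2),
    "functions": Tally(1, 1),
    "lines": line_tally({10, 11, 12, 40}, {1, 2, 10, 11, 12, 40}),
}

if __name__ == "__main__":
    print(failing_metrics(EXAMPLE) or "gate passed")
```

In this example the executed lines 1 and 2 are outside the diff, so they neither help nor hurt the score.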

This is enforced through three mechanisms working together:

  • Coverage gate inline in Phase 7. A task whose new lines do not hit 100/100/100/100 is not allowed to merge. The implementer agent receives the diff, writes the missing tests, and re-runs. Coverage-suppression markers (/* istanbul ignore */, # pragma: no cover, threshold-lowering edits in coverage configs) are rejected by the anti-defer pre-commit hook.
  • Regression-test-first for Fix-class. Every bug fix ships with a paired regression test that fails with the bug present and passes once the fix lands. No "fixed it, will add the test later." The test is the proof the bug existed.
  • Coverage debt logged, not waived. Pre-existing untested lines on touched files feed a coverage-debt heal backlog. A later /brew "heal --coverage" run picks them up; the current task is not blocked on legacy untested code.
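
For a sense of how a hook could reject coverage-suppression markers, here is a minimal sketch that scans only the added lines of a unified diff. It is an assumption-laden illustration, not the anti-defer hook itself: the function name and regexes are made up for this example, and it does not attempt to detect threshold-lowering edits in coverage configs.

```python
import re

# Patterns for the two marker styles named above. Illustrative, not exhaustive.
SUPPRESSION_PATTERNS = (
    re.compile(r"/\*\s*istanbul ignore\b"),
    re.compile(r"#\s*pragma:\s*no cover\b"),
)


def added_suppressions(unified_diff: str) -> list[str]:
    """Return added lines that contain a coverage-suppression marker."""
    hits: list[str] = []
    for line in unified_diff.splitlines():
        # Only '+' lines are new; '+++' is the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        if any(pattern.search(added) for pattern in SUPPRESSION_PATTERNS):
            hits.append(added.strip())
    return hits


# Hypothetical diff: one marker is added, one pre-existing marker is removed.
SAMPLE_DIFF = """\
--- a/auth.py
+++ b/auth.py
@@ -1,2 +1,3 @@
 def login(user):
+    if user is None:  # pragma: no cover
+        return None
-    return user.token  # pragma: no cover
     return user.token
"""

if __name__ == "__main__":
    for hit in added_suppressions(SAMPLE_DIFF):
        print(f"rejected: {hit}")
```

A pre-commit hook built on this idea would exit non-zero whenever the returned list is non-empty.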

Healing woven in

Slop is what we call LLM-induced bloat: over-engineering, defensive code that catches impossible exceptions, redundant comments that re-narrate the symbol name, dead config flags, premature abstraction. The slop-detector and heal-coordinator now run inside every /brew phase, not as a separate deep-clean command. Every touched module is supposed to end the run measurably better than it was found. If it does not, the run flags itself.

Combined with the anti-defer hook (TODOs, type suppressions, lint suppressions, and "for now" framing are all rejected at commit), the operating posture is: within scope, fix it properly. Zero tolerance for "pre-existing" excuses.

Plateau as blocker

Phase 7's qualitative iteration loop now treats a plateau as a brew-internal-bug signal. If the audit-fix loop stops making progress against a finding, it does not silently lower the bar and ship. It logs an escalation, surfaces the resistant issue, and asks Brew Chai's own engineering for a workaround pattern. Plateaus inform the next plugin release; they do not become deviations on the operator's PR.
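
One way to picture the plateau rule: the loop counts consecutive iterations in which the number of open findings fails to shrink, and escalates once that count reaches a limit instead of shipping. This is a sketch under assumptions; `audit_fix_loop`, `patience`, and the finding names are hypothetical, and the real loop may measure progress differently.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    status: str  # "clean" or "escalated"
    iterations: int
    escalations: list[str] = field(default_factory=list)


def audit_fix_loop(
    audit: Callable[[], set[str]],
    fix: Callable[[set[str]], None],
    patience: int = 2,
    max_iterations: int = 20,
) -> LoopResult:
    """Iterate audit -> fix; escalate when findings stop shrinking."""
    stalled = 0
    previous: set[str] | None = None
    for iteration in range(1, max_iterations + 1):
        findings = audit()
        if not findings:
            return LoopResult("clean", iteration)
        if previous is not None and len(findings) >= len(previous):
            stalled += 1  # no progress since the last pass
        else:
            stalled = 0
        if stalled >= patience:
            # Plateau: surface the resistant findings, do not lower the bar.
            return LoopResult("escalated", iteration, sorted(findings))
        fix(findings)
        previous = findings
    return LoopResult("escalated", max_iterations, sorted(previous or []))


def demo() -> LoopResult:
    """Simulated run with one finding the fixer cannot resolve."""
    open_findings = {"unused-import", "missing-null-check", "resistant-finding"}

    def audit() -> set[str]:
        return set(open_findings)

    def fix(findings: set[str]) -> None:
        open_findings.difference_update(findings - {"resistant-finding"})

    return audit_fix_loop(audit, fix)


if __name__ == "__main__":
    print(demo())
```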

Sub-checkpoint resume and Ralph mode

Every sub-step inside Phase 7 (write test, write impl, verify coverage, self-review, heal) is checkpointed. brew --resume <run-id> picks up at the last completed sub-step, not the last completed task. Phase 8 handoff finishes with a deterministic state regardless of how many resumes the run took.
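
Sub-step checkpointing can be sketched as "persist after every sub-step, skip what is already recorded on resume". The JSON file layout, the `run_task` helper, and the step identifiers below are assumptions for illustration; the actual checkpoint format is not documented here.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# The five sub-steps named above, as hypothetical identifiers.
SUB_STEPS = ("write_test", "write_impl", "verify_coverage", "self_review", "heal")


def run_task(
    task_id: str, checkpoint: Path, execute: Callable[[str, str], None]
) -> list[str]:
    """Run the remaining sub-steps, saving progress after each one.

    Returns the sub-steps executed during this call.
    """
    state = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    done: list[str] = state.get(task_id, [])
    ran: list[str] = []
    for step in SUB_STEPS[len(done):]:
        execute(task_id, step)  # may raise if the run is interrupted
        done.append(step)
        state[task_id] = done
        checkpoint.write_text(json.dumps(state))  # checkpoint per sub-step
        ran.append(step)
    return ran


def demo() -> list[str]:
    """Interrupt once at verify_coverage, then resume from the checkpoint."""
    interrupted = {"once": False}

    def flaky(task_id: str, step: str) -> None:
        if step == "verify_coverage" and not interrupted["once"]:
            interrupted["once"] = True
            raise RuntimeError("simulated interruption")

    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = Path(tmp) / "run.json"
        try:
            run_task("task-01", checkpoint, flaky)
        except RuntimeError:
            pass  # the first two sub-steps are already on disk
        return run_task("task-01", checkpoint, flaky)


if __name__ == "__main__":
    print(demo())
```

The resumed call starts at the interrupted sub-step and does not repeat the test or implementation work already recorded.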

brew --ralph uses the Stop hook to keep the session running unattended. The Ralph loop fires Brew Chai again on every Stop event until the run completes, the operator interrupts it, or the escalation log surfaces something only a human can answer.
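
The three exit conditions of the Ralph loop reduce to a small predicate. The sketch below models only that decision, with hypothetical names (`RunState`, `should_refire`); it does not show how the Stop hook itself is wired up.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RunState:
    complete: bool = False
    operator_interrupted: bool = False
    needs_human: bool = False  # escalation log holds a human-only question


def should_refire(state: RunState) -> bool:
    """Fire again on a Stop event unless one of the three exits applies."""
    return not (state.complete or state.operator_interrupted or state.needs_human)


def count_refires(states_at_stop: Iterable[RunState]) -> int:
    """Count re-fires across a sequence of states observed at Stop events."""
    refires = 0
    for state in states_at_stop:
        if not should_refire(state):
            break
        refires += 1
    return refires


if __name__ == "__main__":
    # Two unfinished stops, then the run completes.
    history = [RunState(), RunState(), RunState(complete=True)]
    print(count_refires(history))
```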

One free-text request. One Q&A round. Then silence, until either the PR lands or the escalation log says why it cannot.

Free tier on this update covers Fix-class and Tool-class intent: bug investigations, single-file fixes, audits, diagnostics, the /brew "heal --coverage" backlog runs. Build-class and full Heal-class (multi-sprint plans, parallel worktrees, the deep-clean swarm) are Pro. The reasoning, plus the rest of the surface, lives on /brew-chai/pricing.