Council Review: Audit Findings Validation and Sprint 46 Plan

Decision

Sprint 46 must shift from adding features to hardening observability, error-handling contracts, and runtime decomposition — the three audits collectively show that silent fallbacks and null-overloading are now the highest-leverage sources of brittleness, outweighing any feature gap.

Evidence Reviewed

Methods

This review was conducted as a 6-seat subagent council with explicit user permission. Each seat was given the full evidence pack and asked to produce independent findings, risks, recommendations, assumptions, open questions, and confidence percentages. The Synthesizer merged all reviews into a unified plan.

Findings

Seat Recommendation Confidence Blocking concern
Source-Grounded Archivist 13 of 14 validated claims are accurate; 8 are worse than originally claimed. The most actionable new finding is the 3-LLM-provider catch→null cluster (M-1) and the WebSocketBridge fire-and-forget receive loop (M-2). Fix those first. 0.92 The audit missed: ChatServices.cs 20+ bare catch blocks in the MemorySmith base repo. That's outside scope for the Agent repo but must be surfaced as a cross-repo request.
Data Model Architect The null-overloading problem (11+ meanings) is the single highest-leverage schema fix. Introduce ReplanResult, ParseResult, and a BuildOrigin value object. These three schema changes eliminate entire classes of bugs without changing any behavioral logic. 0.88 BuildOrigin value object changes HasExplicitOrigin semantics from OR to AND — this is a breaking change requiring updates to all 4 origin-resolution touch points. Must be gated behind tests first.
Retrieval Specialist World KB fallback (Validated Claim 7) is the only retrieval-affecting finding. The alias-drift (Claim 10) is a semantic retrieval concern — different canonical IDs produce different search results. Neither is P0. 0.85 The alias-drift (TSK-0099) has LOW priority but the semantic impact is real: "iron"→"raw_iron" vs "iron"→"iron_ore" means search returns different results depending on path.
Human Learning Advocate The silent-failure culture hurts developers most. 5 distinct silent-failure patterns (catch→null, catch→return false, catch{}, fire-and-forget, ReceiveLoopAsync) mean every bug investigation starts with "did it actually fail?" not "what went wrong?" 0.82 Without structured error results, every bug requires reading debug logs. The 3 LLM providers' catch→null means LLM integration bugs are invisible until a user reports "the bot ignored me."
Skeptical Reviewer The audit claims are well-supported but over-prioritize new architectural constructs. BuildOrigin value object and ReplanResult are good designs but building them doesn't fix the core problem — silent failures. Fix the catch→null blocks with logging first (5-minute fixes), then design new types. 0.78 Claim 14 (3 origin layers) is architecturally messy but has caused zero known bugs. Replan losing goal context (Claim 8) IS a real bug pattern. Merge claims 1, 2, 3, 14 into one "BuildOrigin consolidation" task.
Synthesizer Sprint 46 must address ALL catch→null and fire-and-forget paths as P0, then BuildOrigin consolidation as P1, then runtime decomposition as P2. The 3-LLM-provider catch→null plus ChatServices catch blocks are the most impactful 10-minute fix in the entire backlog. 0.90 The Sprint 45 plan (Wave A/B) was well-executed based on the previous council report. These audits surface a deeper layer of issues that Wave A/B didn't touch. Sprint 46 needs a different theme.

Synthesis

SPRINT 46 THEME: "Observability First" — fix all silent failure paths, then consolidate the build-origin model, then begin runtime decomposition planning.

Sprint 46 Plan

Priority Area Description Task ref
P0 Observability WebSocketBridge.ReceiveLoopAsync: wrap in try-catch with log + reconnect TSK-0100
P0 Observability All 7 catch→null paths: add LogWarning before returning null TSK-0101
P0 Cross-repo Document ChatServices.cs 20+ bare catch blocks as base-repo request TSK-0102
P1 BuildOrigin Consolidate BuildOrigin model: merge TSK-0098 + TSK-0095 into single value object TSK-0103
P1 Planner Introduce ReplanResult type + fix unused originalGoal parameter in ReplanAsync TSK-0104
P2 Docs Fix documentation drift: ReplanGovernor, HtnPlanner, phase-status memory TSK-0105
P2 Aliases Extract shared alias dictionaries (TSK-0099) — promote from P3 TSK-0099
P2 Tests Add error-path tests covering all catch→null and fire-and-forget locations TSK-0106
P3 Tests Complete remaining checkpoint tests (TSK-0083 remaining ~50%) TSK-0083
P3 Architecture Runtime decomposition planning: scope 6 interfaces, create extraction plan TSK-0107
P3 Cleanup State duplication (M-4): remove redundant _currentGoal field access TSK-0108

Updated Existing Tasks

Dissent

Disagreement on BuildOrigin sentinel severity. The Source-Grounded Archivist and Data Model Architect rate the sentinel-overloading as HIGH because it violates the "no magic numbers" rule from AGENTS.md. The Skeptical Reviewer argues the collision probability is ~0.1% and the sentinel is documented. Resolution: Fix during BuildOrigin consolidation (P1) but do NOT prioritize separately.

Disagreement on ReplanResult priority. The Human Learning Advocate argues it should be P0 because "null overloading is the #1 debugging friction." The Skeptical Reviewer argues catch→log fixes are P0 while structural type changes are P1. Resolution: Two-phase approach. Catch→log is TSK-0101 (P0, 15 minutes). ReplanResult type is TSK-0104 (P1, 2 hours).

Disagreement on runtime decomposition timeline. The Data Model Architect wants decomposition to start immediately. The Skeptical Reviewer argues the God Object (~80KB) works and decomposing it prematurely could introduce bugs. Resolution: Planning (TSK-0107) starts Sprint 46. Extraction starts Sprint 47+.

Acceptance Criteria

Open Questions

  1. WebSocketBridge ownership: Is WebSocketBridge owned by MinecraftAdapter or AgentBackgroundService? Need to verify the lifecycle before restructuring.
  2. TSK-0102 handoff process: Who in the MemorySmith base repo owns ChatServices.cs? Suggest @TheMasonX as initial reviewer.
  3. LLM provider abstraction: The 3 LLM providers (Anthropic, Gemini, OpenAI) share the same catch→null pattern. Would a shared base class or middleware simplify all 3 fixes?
  4. Coverage collection: Should Sprint 46 include coverage instrumentation? The error-path tests (TSK-0106) would benefit from knowing which catch blocks are actually covered.
  5. Event loop health check: After fixing WebSocketBridge.ReceiveLoopAsync (TSK-0100), should we add a periodic health check ("PING" event every 30s, crash if no response in 60s)?

General Themes & Cross-Cutting Patterns

The three audits and this council review surfaced four major cross-cutting themes:

Theme 1: Silent Failure is the Dominant Risk Pattern (HIGHEST CONCENTRATION)

Theme 2: Null is Overloaded Everywhere (11+ meanings)

Theme 3: Architectural Migration Debt

Theme 4: Documentation Drift is a Trust Problem

Patterns That Were NOT Called Out in the Original Audits

This council discovered 8 missed patterns (M-1 through M-8):

# Pattern Files Severity Found By
M-1 LLM provider catch→null bare catches Anthropic/Gemini/OpenAI providers HIGH Cross-reference LLM providers
M-2 WebSocketBridge ReceiveLoopAsync fire-and-forget WebSocketBridge.cs CRITICAL Trace event receive path
M-3 int.MinValue sentinel HtnPlanner + HtnTaskLibrary MEDIUM Search for sentinel values
M-4 State duplication AgentBackgroundService (multiple fields) LOW-MEDIUM Trace state ownership
M-5 ReplanGovernor XML doc stale ReplanGovernor.cs LOW Compare doc to code
M-6 HtnPlanner XML doc misleading HtnPlanner.cs LOW Compare doc to code
M-7 ChatServices.cs bare catches (base repo) MemorySmith.App/Services/ChatServices.cs HIGH Search for catch-all blocks
M-8 Test coverage gaps Tests project MEDIUM No tests for any catch→null path

The common root cause across all missed patterns: the audits focused on what they set out to find (planner architecture, build origin, runtime design) and didn't systematically search for identical patterns in adjacent code (LLM providers, WebSocket bridge, base repo services).