Codebase Audit Task Vetting - 2026-05-23
Scope
- Included: active task backlog under Data/Tasks, task system usage in /tasks, chat-governance work items, admin/config hardening tasks, and quality gaps around browser smoke and live wiki validation.
- Excluded: broad feature ideation beyond the current imported backlog, and user-authored tasks from Themasonx unless needed for context.
- Timebox: focused medium audit.
Evidence Reviewed
- README.md
- MemorySmith.App/Services/TaskDomainService.cs
- MemorySmith.App/Controllers/TasksController.cs
- MemorySmith.App/Services/AdminSettingsService.cs
- MemorySmith.App/Services/MemorySmithRequestGuardMiddleware.cs
- MemorySmith.App/Services/OperationalDiagnosticsService.cs
- MemorySmith.App/Components/Pages/Chat.razor
- MemorySmith.Tests/AppApiContractTests.cs
- MemorySmith.Tests/SecurityAndSourceLinkTests.cs
- logs/agent-smith-20260523-chat-agent-testing-tracker.md
- logs/agent-smith-20260523-mcp-tooling-audit.md
Findings
| ID | Domain | Severity | Confidence | Summary | Evidence |
|---|---|---|---|---|---|
| F-001 | Backlog governance | Medium | 94% | The non-Themasonx backlog is entirely the imported tracker-import set, and many records still carry placeholder descriptions rather than sprint-ready acceptance criteria. | Data/Tasks/*.json, README.md, TaskDomainService.cs |
| F-002 | Backlog accuracy | Medium | 96% | TSK-0009 is stale: editable admin settings are already implemented and documented. | README.md, AdminSettingsService.cs, project-wiki-admin-auth-hardening |
| F-003 | Quality gates | Medium | 88% | Core routes are documented and tested at API/component levels, but no dedicated browser smoke workflow was found for the main workbench routes. | README.md, MemorySmith.Tests/*, imported TSK-0002 |
| F-004 | Data validation | High | 86% | The live Data/Memories corpus is used as product content and test fixture input, but no dedicated whole-wiki validation task or command was found. | README.md, ProjectWikiTestbaseTests.cs, imported TSK-0003 |
| F-005 | Chat governance | High | 91% | Proposal approval for safe page writes can still fail because chat approvals are effectively coupled to maintenance write-root rules. | logs/agent-smith-20260523-mcp-tooling-audit.md, imported TSK-0016 |
| F-006 | Chat state integrity | High | 90% | Recent testing still reports stale pending counts and badges after mixed approval outcomes, even after batch-approval improvements. | logs/agent-smith-20260523-chat-agent-testing-tracker.md, imported TSK-0017, TSK-0018 |
| F-007 | Remote hardening | High | 89% | Remote API exposure is guarded and diagnosed, but current behavior remains warning-first rather than enforced hardening when AllowRemoteApi=true without an API key. | MemorySmithRequestGuardMiddleware.cs, OperationalDiagnosticsService.cs, SecurityAndSourceLinkTests.cs, imported TSK-0023 |
| F-008 | Task-system governance gap | Medium | 84% | The task system preserves provenance through the reporter field, but there is no explicit review-state workflow for imported tasks, so keeping the backlog trustworthy depends on manual effort. | TaskDomainService.cs, Data/Tasks/*.json |
Task Actions
- Archived TSK-0009 as obsolete because the admin settings-editing gap is already closed.
- Rewrote TSK-0002, TSK-0003, TSK-0016, TSK-0017, TSK-0018, and TSK-0023 into evidence-backed task records with concrete acceptance criteria and validation notes.
- Added TSK-0029 to track imported-task provenance and review-state governance inside /tasks.
Sprint Plan - Stability First Backlog Triage
Sprint Objective
Reduce delivery risk on active governance surfaces before taking additional net-new feature work.
Capacity Assumptions
- Team size: 1 primary maintainer plus agent assistance.
- Effective days: 4 to 5 focused days.
- Risk buffer: 25% for chat-governance repro and validation churn.
Committed Items
- TSK-0016: Fix safe chat page approvals and tighten proposal-time path validation.
- TSK-0017: Reconcile pending-write counters and badges after mixed outcomes.
- TSK-0003: Add live wiki validation for Data/Memories.
- TSK-0002: Add browser smoke coverage for core workbench routes.
- TSK-0029: Add review metadata and triage workflow for imported tasks.
Stretch Items
- TSK-0018: Finish deterministic post-batch reconciliation and close remaining gaps.
- TSK-0023: Harden remote-enable workflows beyond warning-only diagnostics.
Exit Criteria
- Safe chat page proposals can be approved reliably without maintenance-root false negatives.
- Pending-write state is deterministic across mixed outcomes and history reload.
- A dedicated live-wiki validator and route smoke workflow both exist and are runnable.
- Imported tasks can be triaged with an explicit review outcome rather than freeform conventions only.
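One way to read the pending-write criterion above: the badge count is never stored separately but always derived from per-proposal status, so a mixed batch cannot leave it stale. The sketch below is illustrative Python, not the project's C# code. The names `Status`, `Proposal`, `apply_outcomes`, and `pending_count` are hypothetical, and treating a failed approval as terminal rather than returning it to pending is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    FAILED = "failed"  # assumption: a failed approval is terminal, not pending


@dataclass
class Proposal:
    proposal_id: str
    status: Status = Status.PENDING


def apply_outcomes(proposals, outcomes):
    """Apply per-proposal outcomes from one batch; ids not in the batch are untouched."""
    for proposal in proposals:
        if proposal.proposal_id in outcomes:
            proposal.status = outcomes[proposal.proposal_id]


def pending_count(proposals):
    """Derive the badge count from current statuses instead of a stored counter."""
    return Counter(p.status for p in proposals)[Status.PENDING]
```

Because the count is recomputed from the same records that a history reload would return, the two views cannot disagree; whether those records live in UI state or on the server is a separate decision.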
Demo Targets
- Show a mixed chat proposal batch resolving with accurate post-action state.
- Show a failing and passing run of the wiki validator.
- Show the route smoke workflow producing artifacts.
- Show one imported task moving through the new review workflow.
Risk Register
- R-001: Chat-governance fixes may expose a deeper server-side pending-state model issue. Mitigation: keep TSK-0017 and TSK-0018 separate enough to validate incrementally.
- R-002: Browser smoke coverage can become flaky if it depends on unstable local timing. Mitigation: keep the smoke suite small and deterministic.
- R-003: A review-metadata feature can overcomplicate the task model if it jumps straight to a large schema change. Mitigation: start with the minimum queryable provenance/review contract.
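As a rough illustration of what a "minimum queryable provenance/review contract" for R-003 could look like, the sketch below adds only a source, a review state, and a reviewer to each task. It is illustrative Python rather than the project's C# model. The field names, the set of review states, and the helper functions are all assumptions, not the existing task schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical review states; the real workflow may need fewer or more.
REVIEW_STATES = ("unreviewed", "accepted", "rewritten", "archived")


@dataclass
class TaskReview:
    task_id: str
    source: str  # provenance, e.g. "tracker-import" or a user name
    review_state: str = "unreviewed"
    reviewed_by: Optional[str] = None


def set_review_state(review, state, reviewer):
    """Record an explicit review outcome; reject states outside the contract."""
    if state not in REVIEW_STATES:
        raise ValueError(f"unknown review state: {state}")
    review.review_state = state
    review.reviewed_by = reviewer


def needs_triage(reviews, source="tracker-import"):
    """Return ids of tasks from the given source that nobody has reviewed yet."""
    return [
        r.task_id
        for r in reviews
        if r.source == source and r.review_state == "unreviewed"
    ]
```

Three flat fields keep the change queryable without a large schema migration, which is the trade-off the mitigation points at.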
Open Questions
- Should imported-task review state be a first-class field, a label convention, or a hybrid with dedicated UI affordances?
- Should remote hardening become startup-blocking when AllowRemoteApi=true without an API key, or remain a strong readiness/admin gate?
- What is the canonical source of truth for pending-write state: turn-local UI state, trace events, or a server-backed pending store?
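The remote-hardening question comes down to what a single configuration check returns. The sketch below shows the two options side by side, in illustrative Python rather than the project's C#. `RemoteApiSettings`, `evaluate_remote_exposure`, and the `enforce` flag are hypothetical names; only the `AllowRemoteApi` setting and the API key come from the audit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RemoteApiSettings:
    allow_remote_api: bool          # mirrors AllowRemoteApi
    api_key: Optional[str] = None   # assumption: blank or missing means "no key"


def evaluate_remote_exposure(settings, enforce):
    """Return 'ok', 'warn', or 'block' for a remote-API configuration.

    enforce=False models the current warning-first behavior;
    enforce=True models a startup-blocking (or readiness-failing) gate.
    """
    if not settings.allow_remote_api:
        return "ok"
    if settings.api_key and settings.api_key.strip():
        return "ok"
    return "block" if enforce else "warn"
```

With this shape, moving from a warning to an enforced gate is a policy switch rather than a new code path, so the same tests can cover both modes.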
Confidence
- Overall audit confidence: 89%
- Highest-confidence finding: imported backlog provenance and stale-task identification.
- Lowest-confidence finding: the exact implementation shape required for durable pending-write reconciliation.