RFC: Core Memory System Improvements for AI Agents

Status: Draft RFC, council-reviewed on 2026-05-20

Scope: Wiki and memory-system planning only; no implementation is implied by this page

Related: Search and Chat, Council Workflow, Deep Research Prompt, AI Memory Suite Implementation Plan, MemorySmith.Core/Docs/Plans/MemorySystemSchemaImprovements_20260519.md

1. Executive Summary

MemorySmith should improve agent memory, search, human-facing wiki docs, and chat-assisted learning without assuming that either extreme is correct:

The recommended direction is convention-first, validated, and evidence-gated. Use the current MemoryRecord shape first: Content, Tags, References, Conflicts, SourceLinks, Status, Confidence, UsageCount, and LastUpdated. Add lightweight conventions that humans can read and agents can retrieve. Then promote a convention into schema only after the wiki, search probes, and chat traces prove that the concept is durable and repeatedly useful.
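As a reading aid, the record shape named above can be sketched as a small Python dataclass. Only the field names come from this RFC; the types, defaults, the `"active"` status value, and the 0 to 1 confidence scale are assumptions for illustration, not the actual MemorySmith definition.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MemoryRecord:
    """Illustrative stand-in for the current record shape; all types are assumed."""
    content: str                                            # Content (markdown body)
    tags: list[str] = field(default_factory=list)           # Tags (plain strings)
    references: list[str] = field(default_factory=list)     # References (memory IDs)
    conflicts: list[str] = field(default_factory=list)      # Conflicts (memory IDs)
    source_links: list[str] = field(default_factory=list)   # SourceLinks
    status: str = "active"                                  # Status (value set assumed)
    confidence: float = 0.5                                 # Confidence (scale assumed)
    usage_count: int = 0                                    # UsageCount
    last_updated: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)  # LastUpdated
    )


# A convention-first record: behavior hints live in tags, not in new fields.
record = MemoryRecord(
    content="Keep Data/Memories stable.",
    tags=["project-wiki", "kind:rule", "review-after:2026-07"],
)
```

The point of the sketch is that a convention such as `review-after:2026-07` fits inside the existing `Tags` field, so piloting it needs no schema change.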

This plan replaces the earlier “zero schema changes” framing. The better rule is: avoid schema churn, but do not avoid schema changes when they are the simplest long-term representation of a real capability.

Confidence in this revised direction: 82%. The biggest risks are manual convention drift, stale-ranking mistakes, and premature schema expansion.

2. Intended Use Cases

This plan optimizes for three audiences at once.

| Audience | Need | Design implication |
| --- | --- | --- |
| AI coding agents | Compact, source-grounded truth that ranks well in MCP/search/context packs | Keep structured memories atomic, tagged, sourced, and linked. Prefer JSON output for tools when an agent will parse it. |
| Human wiki readers | Browseable explanations, decisions, runbooks, and learning paths | Keep longer narrative in markdown pages. Link pages to structured memories when a fact needs source links or lifecycle metadata. |
| Chat users | Ask questions, learn concepts, inspect evidence, and optionally approve Agent writes | Chat should retrieve both memories and pages, show references/trace evidence, and explain when an answer depends on strict rules or stale records. |

3. Current Ground Truth

Verified during review:

4. Design Principle: Conventions First, Schema When Proven

Conventions are useful when they are visible in the UI, searchable, and easy to correct. Schema is useful when a concept needs type safety, validation, query behavior, or UI controls.

Use conventions when:

Promote to schema when:

These are planning conventions, not yet enforcement rules. Do not assume code behavior exists until tests and implementation records prove it.

5.1 Tags

Keep normal flat tags for broad topics, such as project-wiki, chat, search, and current-state.

Use namespaced tags only for behavior-relevant hints. Prefer lowercase, colon-delimited tags without a leading # because current MemorySmith tags are stored as plain strings:

| Tag | Meaning | Example |
| --- | --- | --- |
| `kind:rule` | A strict rule or invariant appears in the content | `kind:rule` |
| `kind:procedure` | The memory describes a repeatable workflow | `kind:procedure` |
| `priority:critical` | Agents should treat this as high priority when relevant | `priority:critical` |
| `review-after:YYYY-MM` | The record should be reviewed once the given month has passed | `review-after:2026-07` |
| `expires:YYYY-MM` | The record is invalid once the given month has passed, unless renewed | `expires:2026-08` |
| `stale-risk:YYYY-MM` | The record may become stale around the given month but should not auto-expire | `stale-risk:2026-09` |
| `supersedes:<memory-id>` | This record replaces a known older record | `supersedes:project-wiki-old-search-plan` |
| `superseded-by:<memory-id>` | This record should defer to a newer record | `superseded-by:project-wiki-search-roadmap` |
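A warn-only check for these tag conventions could look like the sketch below. The namespaces come from the table; the value patterns (including the memory-ID pattern and the closed list of `kind` values) and the function name `check_tag` are assumptions, and nothing here reflects existing MemorySmith validation.

```python
import re
from typing import Optional

# Value patterns are assumptions layered on the namespaces in the tag table.
MONTH = re.compile(r"\d{4}-(0[1-9]|1[0-2])")
MEMORY_ID = re.compile(r"[a-z0-9][a-z0-9-]*")
NAMESPACES = {
    "kind": re.compile(r"rule|procedure"),
    "priority": re.compile(r"critical"),
    "review-after": MONTH,
    "expires": MONTH,
    "stale-risk": MONTH,
    "supersedes": MEMORY_ID,
    "superseded-by": MEMORY_ID,
}


def check_tag(tag: str) -> Optional[str]:
    """Return a warning for a tag that breaks the conventions, else None."""
    if tag.startswith("#"):
        return f"{tag!r}: drop the leading '#'"
    if tag != tag.lower():
        return f"{tag!r}: tags should be lowercase"
    if ":" not in tag:
        return None  # flat topic tag such as 'search'
    namespace, value = tag.split(":", 1)
    pattern = NAMESPACES.get(namespace)
    if pattern is None:
        return f"{tag!r}: unknown namespace {namespace!r}"
    if not pattern.fullmatch(value):
        return f"{tag!r}: unexpected value for {namespace!r}"
    return None
```

Returning a warning string rather than raising keeps the check advisory, which matches the intent that these are planning conventions and not yet enforcement rules.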

Guardrails:

5.2 Strict Rules in Markdown Content

Use GFM alert blocks for constraints that humans and agents should notice:

> [!IMPORTANT]
> Keep `Data/Memories` stable. Tests that mutate wiki records must copy it to temp storage first.

Recommended content structure for durable memories:

## Rule
> [!IMPORTANT]
> One or two hard constraints, if any.

## Context
Short explanation of why the rule exists.

## Evidence
- Source link or related memory reference.

## Review Notes
- Review after: 2026-07
- Supersedes: old-memory-id, if applicable.

Do not assume GFM alert extraction is implemented. The implementation plan should use a Markdown-aware parser or heavily tested extraction, not an untested regular expression over arbitrary markdown.
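To make the testing concern concrete, the sketch below is a deliberately small line scanner that collects alert blocks while skipping fenced code, which is the kind of edge case any extraction must handle. It is an assumption-laden illustration, not a Markdown parser: it ignores nested quotes, indented code, and alerts that start mid-quote, and a real implementation would likely delegate to a Markdown-aware library. The name `extract_alerts` is hypothetical.

```python
import re

ALERT_START = re.compile(r"^\s{0,3}>\s*\[!(NOTE|TIP|IMPORTANT|WARNING|CAUTION)\]\s*$")
QUOTE_LINE = re.compile(r"^\s{0,3}>\s?(.*)$")
FENCE = re.compile(r"^\s{0,3}(`{3,}|~{3,})")


def extract_alerts(markdown: str) -> list[tuple[str, str]]:
    """Return (kind, text) pairs for alert blocks found outside fenced code."""
    alerts: list[tuple[str, str]] = []
    open_fence = None            # marker of the code fence we are inside, if any
    kind, body = None, []        # alert currently being collected

    def flush() -> None:
        nonlocal kind, body
        if kind is not None:
            alerts.append((kind, " ".join(body).strip()))
        kind, body = None, []

    for line in markdown.splitlines():
        fence = FENCE.match(line)
        if open_fence is not None:
            # Inside fenced code: ignore everything except a matching closing fence.
            if (fence and fence.group(1)[0] == open_fence[0]
                    and len(fence.group(1)) >= len(open_fence)
                    and line.strip() == fence.group(1)):
                open_fence = None
            continue
        if fence:
            flush()
            open_fence = fence.group(1)
            continue
        start = ALERT_START.match(line)
        if start:
            flush()
            kind = start.group(1)
            continue
        quote = QUOTE_LINE.match(line)
        if quote and kind is not None:
            body.append(quote.group(1).strip())
            continue
        flush()                  # any other line ends the current alert
    flush()
    return alerts
```

A useful first test fixture is a page that contains one real alert plus an alert-looking example inside a code fence; only the real one should be returned.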

5.3 Relationships

Today, References and Conflicts are plain memory ID arrays. Treat them as graph edges with simple meaning:

For now, typed relationship details should be written in Content under a Relationship Notes section and optionally mirrored in tags such as supersedes:<id>. A future schema may add a first-class Relations array if convention-based notes are too fragile.

Do not infer automatic conflict resolution merely because one record is newer or has higher confidence. That behavior must be explicit, tested, and visible in search/context-pack output before agents rely on it.
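One way to honor this is to surface conflicts for display without ever choosing a winner. The sketch below assumes records are available as a dictionary keyed by memory ID with a `Conflicts` list; that in-memory shape and the function name are illustrative assumptions.

```python
def unresolved_conflicts(records: dict[str, dict]) -> list[tuple[str, str]]:
    """List each conflicting pair once, for display only.

    Confidence and LastUpdated are intentionally ignored: this function
    reports conflicts and never decides which record should win.
    """
    pairs = set()
    for memory_id, record in records.items():
        for other_id in record.get("Conflicts", []):
            # Sort the pair so a conflict declared on both sides appears once.
            pairs.add(tuple(sorted((memory_id, other_id))))
    return sorted(pairs)


records = {
    "search-plan-old": {"Conflicts": ["search-roadmap"], "Confidence": 0.9},
    "search-roadmap": {"Conflicts": ["search-plan-old"], "Confidence": 0.6},
    "chat-ux-notes": {"Conflicts": []},
}
print(unresolved_conflicts(records))
```

Search, context-pack, or chat trace output could then show the pair as an open conflict until a human or a tested rule resolves it.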

6. Search and Retrieval Improvements

6.1 Baseline Strategy

6.2 Pages as First-Class Knowledge

Pages are not just notes; they are the human-readable half of the wiki. The search roadmap should eventually treat pages as first-class retrieval units:

6.3 Staleness Before Decay

The old plan proposed a temporal decay formula. The safer order is:

  1. Add explicit staleness metadata as tags and visible warnings.
  2. Show staleness warnings in search/context-pack/chat trace output.
  3. Measure whether stale records are harming answers.
  4. Only then apply ranking changes.

Initial behavior should warn, not hide. Never silently bury Core rules, unresolved tasks, or high-confidence architecture decisions because they are old.
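The warn-first behavior can be sketched as a pure function over tags that returns messages and leaves ranking untouched. The comparison rule (a warning fires once the current month is later than the tagged month), the message wording, and the function name are assumptions.

```python
def staleness_warnings(tags: list[str], current_month: str) -> list[str]:
    """Return human-readable staleness warnings for one record.

    Months are 'YYYY-MM' strings, which compare correctly as plain text.
    The function only reports; it never hides or re-ranks a record.
    """
    warnings = []
    for tag in tags:
        namespace, _, month = tag.partition(":")
        if not month or current_month <= month:
            continue  # flat tag, or the tagged month has not passed yet
        if namespace == "expires":
            warnings.append(f"expired after {month}")
        elif namespace == "review-after":
            warnings.append(f"review was due after {month}")
        elif namespace == "stale-risk":
            warnings.append(f"may be stale since {month}")
    return warnings


tags = ["search", "review-after:2026-07", "expires:2026-08"]
print(staleness_warnings(tags, "2026-08"))
```

Because the output is a list of strings, the same warnings can be attached to search results, context packs, and chat traces before any ranking change is considered.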

If ranking decay is later implemented, it should be bounded, reversible, and tested against the live project wiki. A draft scoring rule:

finalScore = baseSearchScore * confidenceMultiplier * freshnessHint * usageHint

Where:
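A minimal sketch of what "bounded" and "reversible" could mean for this draft rule, assuming each factor is clamped to a narrow range around 1.0. The specific ranges below are hypothetical placeholders chosen for illustration, not proposed values.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed range [low, high]."""
    return max(low, min(high, value))


def final_score(
    base_search_score: float,
    confidence_multiplier: float = 1.0,
    freshness_hint: float = 1.0,
    usage_hint: float = 1.0,
) -> float:
    """Apply the draft rule with every factor bounded (ranges are assumed).

    Passing the defaults (all 1.0) returns the base score unchanged, so the
    adjustment can be switched off without touching the search engine.
    """
    return (
        base_search_score
        * clamp(confidence_multiplier, 0.8, 1.2)
        * clamp(freshness_hint, 0.85, 1.0)   # staleness can cost at most 15%
        * clamp(usage_hint, 1.0, 1.1)        # usage can add at most 10%
    )
```

With bounds like these, even an extreme freshness value cannot push an old but relevant record far down the results, which is consistent with the rule above about never silently burying Core rules.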

7. Chat and Human Learning Improvements

Chat should be a learning surface, not only a text box over search.

Target behavior:

Good chat prompts for humans:

8. Governance for Agent Writes

Agent writes should remain opt-in and approval-gated. For memory quality, approval should consider more than whether the JSON is valid.

Suggested approval checklist:

Future UI/trace improvements may surface these checklist items directly in the Agent write approval panel.

9. Phased Plan

Phase 0: Decision Cleanup

Phase 1: Documentation and Convention Pilot

Phase 2: Validation Without Behavior Change

Phase 3: Retrieval Output Quality

Phase 4: Schema Promotion Decision

Promote conventions into schema only if evidence from Phases 1-3 shows repeated need. Candidate fields:

Each promoted field needs migration, UI support, tests, and fallback behavior for legacy records.

Phase 5: Long-Term Capability Expansion

If maximizing usefulness requires broader changes, they should be considered openly rather than blocked by the earlier lean, convention-only premise:

These should remain evidence-driven additions, not automatic scope.

10. Acceptance Criteria Before Implementation

Do not implement ranking, schema, or write-behavior changes from this RFC until these are true:

11. Open Questions

Externally researchable parts of these questions have been converted into a reusable Deep Research Prompt. Use the research results as evidence for a later council review; do not treat external recommendations as automatic MemorySmith decisions.

12. Council Review Summary

Four review lenses were applied: architecture/data model, search/retrieval/MCP, human learning/chat UX, and skeptical risk.

Shared conclusions:

Overall confidence after review: 82% for the revised convention-first, evidence-gated approach; 55% for the original convention-only plan as written.