Local Fine-Tune Harness

MemorySmith includes a local fine-tuning pipeline for training custom Ollama models using chat transcripts and feedback data.

Overview

The harness enables a data flywheel: chat interactions generate training data, feedback ratings identify high-quality examples, and the fine-tune pipeline produces a specialized model that improves future chat quality.

Components

Data Plane (C#)

Orchestration (C#)

Execution (Python)

Configuration

Settings under MemorySmith:Training:

Setting                     Default      Description
--------------------------  -----------  ----------------------------------------------
ChatTranscriptEnabled       false        Enable chat transcript capture
StoreChatContent            false        Store full message content (not just metadata)
TranscriptRedactionEnabled  true         Redact secrets in stored transcripts
TranscriptRetentionDays     90           Auto-delete transcripts older than N days
FeedbackEnabled             false        Enable thumbs up/down feedback UI
MaxRunMinutes               360          Training run timeout
PreferenceFormat            FilteredSft  Export format (FilteredSft, Dpo, Orpo)
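
To illustrate how the PreferenceFormat setting might shape the exported data, here is a minimal Python sketch. It is not the harness's actual exporter. The record fields (prompt, response, rating), the output field names, and the export_examples function are hypothetical. The sketch assumes that FilteredSft keeps only thumbs-up responses as supervised examples, and that Dpo and Orpo both consume chosen/rejected pairs for the same prompt.

```python
from collections import defaultdict


def export_examples(records, preference_format="FilteredSft"):
    """Turn rated chat records into training examples.

    Each record is a hypothetical dict of the form
    {"prompt": str, "response": str, "rating": "up" | "down"}.
    """
    if preference_format == "FilteredSft":
        # Supervised fine-tuning on positively rated responses only.
        return [
            {"prompt": r["prompt"], "completion": r["response"]}
            for r in records
            if r["rating"] == "up"
        ]

    if preference_format in ("Dpo", "Orpo"):
        # Preference-based methods need a chosen and a rejected
        # response for the same prompt, so group ratings by prompt.
        by_prompt = defaultdict(lambda: {"up": [], "down": []})
        for r in records:
            if r["rating"] in ("up", "down"):
                by_prompt[r["prompt"]][r["rating"]].append(r["response"])

        pairs = []
        for prompt, rated in by_prompt.items():
            for chosen in rated["up"]:
                for rejected in rated["down"]:
                    pairs.append(
                        {"prompt": prompt, "chosen": chosen, "rejected": rejected}
                    )
        return pairs

    raise ValueError(f"Unknown preference format: {preference_format}")
```

Under these assumptions, a prompt with only thumbs-up ratings still contributes to a FilteredSft export but yields no pair for Dpo or Orpo, so the preference formats need both positive and negative feedback on the same prompt.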

Getting Started

  1. Enable transcript capture: set Training:ChatTranscriptEnabled=true and Training:StoreChatContent=true in admin settings
  2. Use the chat normally — transcripts accumulate in Data/Events/chat-transcripts/
  3. Rate responses with thumbs up/down to build a feedback signal
  4. When ready, visit /training-workbench (requires Admin role)
  5. Click "Start training run" (or "Start dry run" for a simulated test)

The harness checks automatically whether real training is possible (CUDA is available and the required dependencies are installed). If either check fails, it falls back to simulated mode.
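
The detect-and-fall-back behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the harness's own code. The dependency list, the function names, and the "real"/"simulated" mode strings are hypothetical. Only importlib.util.find_spec and torch.cuda.is_available are real library calls.

```python
import importlib.util

# Hypothetical dependency list; the packages the harness actually
# requires are not stated in this document.
ASSUMED_DEPENDENCIES = ("torch", "transformers", "peft")


def missing_dependencies(names=ASSUMED_DEPENDENCIES):
    """Return the top-level packages that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def cuda_available():
    """Report whether PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    try:
        import torch
        return torch.cuda.is_available()
    except Exception:
        # A broken install counts as "no CUDA" for this sketch.
        return False


def select_mode(dry_run=False, missing=None, has_cuda=None):
    """Pick "real" only when every prerequisite holds, else "simulated".

    The missing and has_cuda arguments let callers inject results,
    which keeps the decision logic testable without a GPU.
    """
    if dry_run:
        return "simulated"
    if missing is None:
        missing = missing_dependencies()
    if missing:
        return "simulated"
    if has_cuda is None:
        has_cuda = cuda_available()
    return "real" if has_cuda else "simulated"
```

Calling select_mode() with no arguments probes the current machine. Keeping the probes separate from the decision means the fallback rule can be checked on any machine, including one without CUDA.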

Requirements for Real Training

Run Scripts/Test-FinetuneHarnessPrereqs.ps1 to check whether the machine is ready for real training.