Pack System

How packs specialize Working Mind for your domain -- and why expert-curated knowledge is the most valuable training data in the world.

What Are Packs?

Packs specialize Working Mind for a domain. A pack is a directory containing a system prompt, tool declarations, skills, personas, and commands. The core agent is pack-agnostic -- it knows nothing about any domain. Packs define the domain.

The Starter Pack

The starter pack is the only pack that ships with Working Mind today. It provides:

  • Persistent memory -- built-in SQLite knowledge graph with FTS5 search, contradiction detection, and temporal validity
  • Web search -- via Brave Search MCP server (requires BRAVE_API_KEY)
  • Web scraping -- via Firecrawl MCP server (requires FIRECRAWL_API_KEY)
  • Session summary and export -- built-in curation prompts

The starter pack works immediately. Memory works without any API keys or MCP servers. Search and scraping are optional -- add keys when you're ready.
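
A minimal launch sketch, assuming the keys are supplied as environment variables (the variable names come from the list above):

# Memory works out of the box; these exports only enable the optional servers
export BRAVE_API_KEY="..."         # unlocks web search
export FIRECRAWL_API_KEY="..."     # unlocks web scraping
wmind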

Pack Anatomy

my-pack/
  pack.json       -- name, version, MCP server declarations, settings
  prompt.md       -- the domain expert system prompt
  skills/         -- reusable subroutines (step-by-step workflows)
  commands/       -- slash command entry points (markdown files)
  personas/       -- execution modes (role-specific behaviors)
  curation/       -- output templates (summarize + export formats)

Every file is human-readable. The directory is forkable (copy it, edit prompt.md), versionable (pack.json has a version field), distributable (push to git), and composable (load multiple packs).
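
A sketch of that lifecycle in shell (paths are illustrative):

# Fork: copy the pack, then rewrite the prompt for your domain
cp -r packs/starter my-pack
$EDITOR my-pack/prompt.md

# Version and distribute: a pack is a plain directory, so git just works
git -C my-pack init
git -C my-pack add -A
git -C my-pack commit -m "my-pack v0.1.0"

# Compose: load it alongside the starter pack
wmind --pack starter --pack ./my-pack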

Pack Sections

Each part of the pack serves a specific purpose:

Section      Purpose                              See
prompt.md    Domain expert system prompt          Pack Prompt
personas/    Execution modes with tool filters    Personas
skills/      Reusable multi-step subroutines      Skills
commands/    Slash command entry points           Pack Commands
curation/    Output and export templates          Curation

pack.json

{
  "name": "my-pack",
  "version": "0.1.0",
  "description": "A pack for my domain",
  "prompt": "prompt.md",
  "mcpServers": {
    "brave-search": {
      "package": "@brave/brave-search-mcp-server",
      "required": false
    }
  },
  "personas": {
    "analyst": { "prompt": "personas/analyst.md" }
  },
  "skills": {
    "deep-research": { "instructions": "skills/deep-research.md" }
  }
}
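
Each entry under skills points a skill at a markdown file of step-by-step instructions. As a hypothetical illustration (not an actual file from the repository), skills/deep-research.md could read:

# Deep Research

When this skill is activated:

1. Restate the research question and confirm scope.
2. Search the web for primary sources; prefer papers over summaries.
3. Save each verified finding to memory as an observation.
4. Finish with a summary that cites every source used.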

Loading Packs

# Load the default starter pack
wmind

# Load a specific pack
wmind --pack researcher

# Load a pack from a local directory
wmind --pack ./my-custom-pack

# Load multiple packs
wmind --pack starter --pack my-custom-pack

Building a Pack

  1. Copy the starter pack -- it's in packs/starter/ of the Working Mind repository
  2. Edit prompt.md -- write a domain expert system prompt (a minimal sketch follows this list)
  3. Add MCP servers -- declare them in pack.json under mcpServers
  4. Add skills -- create markdown files in skills/
  5. Add personas -- create markdown files in personas/
  6. Test -- run wmind --pack ./path/to/my-pack
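
A minimal prompt.md (step 2) might look like the following. The content is hypothetical -- replace it with real expertise from your domain:

You are a senior analyst in [your domain].

Principles:

- Answer from the knowledge graph first; search the web only to fill gaps.
- Save every fact the user teaches you as an observation.
- When observations conflict, surface the contradiction instead of guessing.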

Why Packs Work: The Expert-Curated Advantage

The pack system is built on a specific hypothesis: domain experts who curate their own knowledge graphs produce the highest quality training data that exists. Not synthetic data. Not crowdsourced annotations. Not GPT-4-generated pairs. Expert-curated, iterated-to-correctness, graph-structured knowledge.

The Loop

  1. You are a domain expert. You know your field.
  2. You instruct Working Mind. It saves what you teach it.
  3. You ask questions. It answers from the graph. You know immediately if the answer is right.
  4. If it's wrong, you fix the graph. The next answer is right.
  5. When the answers are consistently perfect, the graph is done.

That graph -- verified by the only person qualified to verify it -- is the single best source of domain training data. Every entity was explicitly created. Every observation was verified. Every relation was intentional. No other data pipeline produces this quality.
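
To make this concrete, here is what one unit of such a graph might look like. The record below is hypothetical and the field names are illustrative -- they are not the actual SQLite schema:

{
  "entity": "CRISPR-Cas9",
  "observations": [
    {
      "text": "Cuts double-stranded DNA at a site matched by its guide RNA",
      "verified_by": "expert",
      "valid_from": "2025-01-15"
    }
  ],
  "relations": [
    { "from": "CRISPR-Cas9", "type": "derived_from", "to": "Streptococcus pyogenes" }
  ]
}

Every field in that record exists because the expert typed it or confirmed it.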

From Graph to Fine-Tuned Model

The pack hypothesis extends further: if the graph produces perfect answers for a domain expert, it contains enough structure to fine-tune a small model that answers those same questions without the graph.

Research supports this:

Evidence                                                     Source
Break-even at ~100 labeled samples                           Multiple papers
OPT-350M beats ChatGPT on tool calling (+3x)                 arxiv:2305.15044
Lawma-8B beats Claude 3.7 Sonnet on legal tasks (+9pp)       arxiv:2501.14013
500 curated examples beat 5,000 auto-generated               Consistent across the fine-tuning literature
KG data needs no GPT-4 quality check; it is already curated  KG2Tool (arxiv:2506.21071)

The key insight: no pipeline can prevent shit in, shit out. But a domain expert who iterates until the graph is right eliminates the problem at the source. Data quality is solved by definition -- not by validation, not by filtering, not by synthetic quality checks, but by the expert who built it.

When your graph produces perfect answers, you export it. The .oexp format carries the full graph, taxonomy, persona lenses, and readiness assessment. From there, instruction pairs are generated and a small model is fine-tuned. The result: a domain champion that knows your field better than any frontier model -- because it was trained on knowledge you verified yourself.
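
The pair format is not final while the pipeline is in development (see the next section). Purely as a sketch, the hypothetical relation from the earlier example could yield an instruction pair like:

{
  "instruction": "Which organism is Cas9 derived from?",
  "output": "Cas9 is derived from Streptococcus pyogenes.",
  "source_relation": "CRISPR-Cas9 --derived_from--> Streptococcus pyogenes"
}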

What We Know and What's Next

What works now:

  • The pack format loads correctly
  • MCP servers wire up as declared
  • Skills activate and inject instructions
  • Personas change agent behavior
  • Domain-specific packs measurably improve tool selection and task completion
  • Expert users who iterate get consistent, correct answers from their graphs

What's in development:

  • The graph-to-fine-tune pipeline (.oexp export + instruction pair generation)
  • Persona-driven export (same graph, multiple training data flavors)
  • Readiness checks (is your graph deep enough for fine-tuning?)

The path:

  • Today: build your graph, verify your answers, compound your knowledge
  • Next: export your graph as training data
  • Future: fine-tune a domain champion from your expertise

Research Context

The pack architecture draws on and aligns with several active research threads:

  • SoK: Agentic Skills (Jiang et al. 2026) -- Maps the full lifecycle of agentic skills. Packs implement several of their seven design patterns: metadata-driven disclosure, natural-language skills, and git-based distribution. Their security analysis validates our decision to keep packs local-only.
  • Adaptation of Agentic AI (Jiang et al. 2025) -- Four-paradigm framework for agent adaptation. Packs are primarily T1 (reusable, agent-agnostic modules). Persona-driven tool filtering moves toward T2 (agent-supervised modules).
  • Agentic Proposing (Jiao et al. 2026) -- Compositional skill synthesis outperforms monolithic approaches. Pack skills implement this: each skill is a composable subroutine the agent activates on demand.
  • MACRO (Fan et al. 2026) -- Agents that discover and register composite tools from execution trajectories outperform static tool sets. Packs should evolve from declarative to self-improving.
  • EvoTest (He et al. 2025) -- Evolves the entire agent configuration after every episode. Packs are a static snapshot of this -- future versions could auto-tune based on session outcomes.
  • xLAM (Zhang et al. 2024) -- Purpose-trained action models that outperform GPT-4 on tool use. Packs take the opposite approach: configure the environment, not the model. Both paths are complementary.
  • UniToolCall (Liang et al. 2026) -- Fine-tuned 8B models achieve 93% tool-call precision with unified training data. Packs provide the domain-specific tool schemas that make such fine-tuning possible.
  • KG2Tool (arxiv 2506.21071) -- KG-constructed data does not need quality checking because KGs are already curated. Validates our expert-curation-first approach.