# Pack System
## What Are Packs?
Packs specialize Working Mind for a domain. A pack is a directory containing a system prompt, tool declarations, skills, personas, and commands. The core agent is pack-agnostic -- it knows nothing about any domain. Packs define the domain.
## The Starter Pack

The starter pack is the only pack that ships with Working Mind today. It provides:
- Persistent memory -- built-in SQLite knowledge graph with FTS5 search, contradiction detection, and temporal validity
- Web search -- via Brave Search MCP server (requires `BRAVE_API_KEY`)
- Web scraping -- via Firecrawl MCP server (requires `FIRECRAWL_API_KEY`)
- Session summary and export -- built-in curation prompts
The starter pack works immediately. Memory works without any API keys or MCP servers. Search and scraping are optional -- add keys when you're ready.
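The built-in memory layer is plain SQLite, so full-text search needs nothing beyond the standard library. A minimal sketch of FTS5-backed search (the table and column names here are illustrative, not Working Mind's actual schema):

```python
import sqlite3

# In-memory database for illustration; the real memory layer persists to disk.
conn = sqlite3.connect(":memory:")

# An FTS5 virtual table indexes text for full-text search and bm25 ranking.
conn.execute("CREATE VIRTUAL TABLE observations USING fts5(entity, body)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?)",
    [
        ("fts5", "FTS5 is SQLite's full-text search extension."),
        ("pack", "A pack is a directory of prompts, skills, and personas."),
    ],
)

# MATCH runs the full-text query; bm25() orders results by relevance.
rows = conn.execute(
    "SELECT entity FROM observations WHERE observations MATCH ? "
    "ORDER BY bm25(observations)",
    ("search",),
).fetchall()
print(rows)  # [('fts5',)]
```

Contradiction detection and temporal validity sit on top of ordinary tables like this one; only the search path needs the FTS5 extension, which ships with Python's bundled SQLite.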
## Pack Anatomy
```
my-pack/
├── pack.json    -- name, version, MCP server declarations, settings
├── prompt.md    -- the domain expert system prompt
├── skills/      -- reusable subroutines (step-by-step workflows)
├── commands/    -- slash command entry points (markdown files)
├── personas/    -- execution modes (role-specific behaviors)
└── curation/    -- output templates (summarize + export formats)
```
Every file is human-readable. The directory is forkable (copy it, edit prompt.md), versionable (pack.json has a version field), distributable (push to git), and composable (load multiple packs).
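A layout like this can be checked mechanically. A hypothetical validator, assuming only pack.json and prompt.md are mandatory (the function and that rule are inventions for illustration, not Working Mind's actual loader):

```python
from pathlib import Path

# Only pack.json and prompt.md are treated as required in this sketch;
# Working Mind's actual loader rules may differ.
REQUIRED_FILES = ["pack.json", "prompt.md"]
SECTION_DIRS = ["skills", "commands", "personas", "curation"]

def describe_pack(root: Path) -> dict:
    """Report which optional sections a pack directory contains."""
    missing = [f for f in REQUIRED_FILES if not (root / f).is_file()]
    if missing:
        raise FileNotFoundError(f"{root} is missing {missing}")
    return {
        d: sorted(p.name for p in (root / d).glob("*.md"))
        for d in SECTION_DIRS
        if (root / d).is_dir()
    }
```

Because every section is plain markdown on disk, a forked pack passes the same check as the original.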
## Pack Sections
Each part of the pack serves a specific purpose:
| Section | Purpose | See |
|---|---|---|
| prompt.md | Domain expert system prompt | Pack Prompt |
| personas/ | Execution modes with tool filters | Personas |
| skills/ | Reusable multi-step subroutines | Skills |
| commands/ | Slash command entry points | Pack Commands |
| curation/ | Output and export templates | Curation |
### pack.json

```json
{
  "name": "my-pack",
  "version": "0.1.0",
  "description": "A pack for my domain",
  "prompt": "prompt.md",
  "mcpServers": {
    "brave-search": {
      "package": "@brave/brave-search-mcp-server",
      "required": false
    }
  },
  "personas": {
    "analyst": { "prompt": "personas/analyst.md" }
  },
  "skills": {
    "deep-research": { "instructions": "skills/deep-research.md" }
  }
}
```
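A sketch of what reading this manifest might look like (the function name and validation rules are assumptions based on the example, not Working Mind's actual loader):

```python
import json
from pathlib import Path

def load_manifest(path: Path) -> dict:
    """Parse pack.json and split out MCP servers by their "required" flag.

    The required-field list below is an assumption drawn from the example
    manifest, not Working Mind's actual schema.
    """
    manifest = json.loads(path.read_text())
    for field in ("name", "version", "prompt"):
        if field not in manifest:
            raise ValueError(f"pack.json is missing required field {field!r}")
    # Servers declared "required": false (like brave-search above) can be
    # skipped at startup; required ones must be reachable.
    required = [
        name
        for name, spec in manifest.get("mcpServers", {}).items()
        if spec.get("required", False)
    ]
    return {"manifest": manifest, "required_servers": required}
```

Marking a server `"required": false` is what lets the starter pack run with no API keys: optional servers simply stay unloaded until their keys exist.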
## Loading Packs

```bash
# Load the default starter pack
wmind

# Load a specific pack
wmind --pack researcher

# Load a pack from a local directory
wmind --pack ./my-custom-pack

# Load multiple packs
wmind --pack starter --pack my-custom-pack
```
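The two addressing forms above (a bundled pack name vs. a local path) suggest a resolver along these lines (a sketch only; the `packs/` search directory is an assumption):

```python
from pathlib import Path

# Assumed location of bundled packs; the real install layout may differ.
BUILTIN_PACKS = Path("packs")

def resolve_pack(ref: str) -> Path:
    """Map a --pack argument to a pack directory.

    Anything that looks like a path (./my-custom-pack, /abs/path) is used
    as-is; a bare name (starter, researcher) is looked up among the
    bundled packs.
    """
    if ref.startswith((".", "/", "~")) or "/" in ref:
        return Path(ref).expanduser()
    return BUILTIN_PACKS / ref
```

With multiple `--pack` flags, each reference resolves independently, which is what makes bundled and custom packs composable on one command line.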
## Building a Pack

- Copy the starter pack -- it's in `packs/starter/` of the Working Mind repository
- Edit `prompt.md` -- write a domain expert system prompt
- Add MCP servers -- declare them in `pack.json` under `mcpServers`
- Add skills -- create markdown files in `skills/`
- Add personas -- create markdown files in `personas/`
- Test -- run `wmind --pack ./path/to/my-pack`
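The steps above can be scripted. A sketch that scaffolds an empty pack with placeholder content (not an official tool; the placeholder strings are invented):

```python
import json
from pathlib import Path

def scaffold_pack(root: Path, name: str) -> None:
    """Create the directory skeleton from the anatomy section."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "pack.json").write_text(json.dumps({
        "name": name,
        "version": "0.1.0",
        "prompt": "prompt.md",
        "mcpServers": {},
        "personas": {},
        "skills": {},
    }, indent=2))
    (root / "prompt.md").write_text("You are a domain expert.\n")
    for section in ("skills", "commands", "personas", "curation"):
        (root / section).mkdir(exist_ok=True)
```

From there the loop is edit, run `wmind --pack ./my-pack`, and iterate on the prompt and skills.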
## Why Packs Work: The Expert-Curated Advantage
The pack system is built on a specific hypothesis: domain experts who curate their own knowledge graphs produce the highest quality training data that exists. Not synthetic data. Not crowdsourced annotations. Not GPT-4-generated pairs. Expert-curated, iterated-to-correctness, graph-structured knowledge.
### The Loop
- You are a domain expert. You know your field.
- You instruct Working Mind. It saves what you teach it.
- You ask questions. It answers from the graph. You know immediately if the answer is right.
- If it's wrong, you fix the graph. The next answer is right.
- When the answers are consistently perfect, the graph is done.
That graph -- verified by the only person qualified to verify it -- is the single best source of domain training data. Every entity was explicitly created. Every observation was verified. Every relation was intentional. No other data pipeline produces this quality.
### From Graph to Fine-Tuned Model
The pack hypothesis extends further: if the graph produces perfect answers for a domain expert, it contains enough structure to fine-tune a small model that answers those same questions without the graph.
Research supports this:
| Evidence | Source |
|---|---|
| Break-even at ~100 labeled samples | Multiple papers |
| OPT-350M beats ChatGPT on tool calling (+3x) | arxiv:2305.15044 |
| Lawma-8B beats Claude 3.7 Sonnet on legal (+9pp) | arxiv:2501.14013 |
| 500 curated examples > 5,000 auto-generated | Consistent across fine-tuning literature |
| KG data needs no GPT-4 quality check -- it is already curated | KG2Tool (arxiv:2506.21071) |
The key insight: shit in, shit out cannot be prevented by any pipeline. But a domain expert who iterates until the graph is right eliminates the problem at the source. The data quality is solved by definition -- not by validation, not by filtering, not by synthetic quality checks, but by the expert who built it.
When your graph produces perfect answers, you export it. The .oexp format carries the full graph, taxonomy, persona lenses, and readiness assessment. From there, instruction pairs are generated and a small model is fine-tuned. The result: a domain champion that knows your field better than any frontier model -- because it was trained on knowledge you verified yourself.
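As a sketch of the instruction-pair step, assuming the graph can be flattened to (entity, relation, object) triples (the triple format and question template here are invented; the actual `.oexp` pipeline is richer):

```python
def triples_to_pairs(triples):
    """Turn (entity, relation, object) triples into instruction pairs.

    The question template is a naive placeholder; a real pipeline would
    vary phrasing and fold in observations and persona lenses.
    """
    return [
        {"instruction": f"What is the {relation} of {entity}?",
         "response": obj}
        for entity, relation, obj in triples
    ]

pairs = triples_to_pairs([("aspirin", "drug class", "NSAID")])
print(pairs[0]["instruction"])  # What is the drug class of aspirin?
```

Because every triple was created and verified by the expert, each generated pair inherits that verification for free.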
## What We Know and What's Next
What works now:
- The pack format loads correctly
- MCP servers wire up as declared
- Skills activate and inject instructions
- Personas change agent behavior
- Domain-specific packs measurably improve tool selection and task completion
- Expert users who iterate get consistent, correct answers from their graphs
What's in development:
- The graph-to-fine-tune pipeline (`.oexp` export + instruction pair generation)
- Persona-driven export (same graph, multiple training data flavors)
- Readiness checks (is your graph deep enough for fine-tuning?)
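A readiness check could reduce to simple graph statistics. A purely illustrative sketch with invented thresholds (this feature is still in development, so none of this is the shipped logic):

```python
def readiness(num_entities: int, num_relations: int, num_observations: int):
    """Crude depth heuristics for whether a graph can support fine-tuning.

    The thresholds below are invented for illustration only.
    """
    checks = {
        # ~100 labeled samples is the break-even figure cited earlier
        "enough_observations": num_observations >= 100,
        "enough_entities": num_entities >= 50,
        # average degree >= 1 suggests the graph is actually connected
        "relations_cover_entities": num_relations >= num_entities,
    }
    return all(checks.values()), checks
```

Returning the per-check breakdown, not just a boolean, tells the expert which dimension of the graph still needs depth.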
The path:
- Today: build your graph, verify your answers, compound your knowledge
- Next: export your graph as training data
- Future: fine-tune a domain champion from your expertise
## Research Context
The pack architecture draws on and aligns with several active research threads:
| Paper | Connection to Packs |
|---|---|
| SoK: Agentic Skills (Jiang et al. 2026) | Maps the full lifecycle of agentic skills. Packs implement several of their seven design patterns: metadata-driven disclosure, natural-language skills, and git-based distribution. Their security analysis validates our decision to keep packs local-only. |
| Adaptation of Agentic AI (Jiang et al. 2025) | Four-paradigm framework for agent adaptation. Packs are primarily T1 (reusable, agent-agnostic modules). Persona-driven tool filtering moves toward T2 (agent-supervised modules). |
| Agentic Proposing (Jiao et al. 2026) | Compositional skill synthesis outperforms monolithic approaches. Pack skills implement this: each skill is a composable subroutine the agent activates on demand. |
| MACRO (Fan et al. 2026) | Agents that discover and register composite tools from execution trajectories outperform static tool sets. Packs should evolve from declarative to self-improving. |
| EvoTest (He et al. 2025) | Evolves the entire agent configuration after every episode. Packs are a static snapshot of this -- future versions could auto-tune based on session outcomes. |
| xLAM (Zhang et al. 2024) | Purpose-trained action models that outperform GPT-4 on tool use. Packs take the opposite approach: configure the environment, not the model. Both paths are complementary. |
| UniToolCall (Liang et al. 2026) | Fine-tuned 8B models achieve 93% tool-call precision with unified training data. Packs provide the domain-specific tool schemas that make such fine-tuning possible. |
| KG2Tool (arxiv 2506.21071) | KG-constructed data does not need quality checking because KGs are already curated. Validates our expert-curation-first approach. |