
Prompt System

29-layer PromptBuilder, section caching, assembly paths, and thinking modes.

Overview

The prompt system is the canonical way Aleph assembles context for LLM calls. It lives in src/thinker/ and is built around PromptPipeline, whose 29 layers each append content to the final system prompt string.

The sole public entry point is thinker::PromptBuilder, which wraps a PromptPipeline. The old agent_loop::PromptBuilder was removed during the Harness migration.

Architecture

FlowRequest


┌─────────────────────────────────────────┐
│              PromptBuilder              │
│  ┌───────────────────────────────────┐  │
│  │          PromptPipeline           │  │
│  │  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐  │  │
│  │  │Layer│→│Layer│→│Layer│→│ ... │  │  │
│  │  │  1  │ │  2  │ │  3  │ │ 29  │  │  │
│  │  └─────┘ └─────┘ └─────┘ └─────┘  │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘


System Prompt String

Layer System

Each layer implements the PromptLayer trait:

pub trait PromptLayer: Send + Sync {
    fn name(&self) -> &'static str;
    fn priority(&self) -> u32;
    fn paths(&self) -> &'static [AssemblyPath];
    fn inject(&self, output: &mut String, input: &LayerInput);
}

The PromptPipeline executes layers in ascending priority order (lowest number first). Each layer inspects the LayerInput context and appends its content to the output string.
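The ordering rule can be illustrated with a minimal sketch. Only the PromptLayer trait is taken from this page; AssemblyPath, LayerInput, the two toy layers, and the assemble function are simplified assumptions made up for the example, not Aleph's actual types.

```rust
// Illustrative sketch: AssemblyPath, LayerInput, the two layers, and
// `assemble` are simplified stand-ins, not Aleph's actual definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AssemblyPath {
    Basic,
    Soul,
}

pub struct LayerInput {
    pub path: AssemblyPath,
}

pub trait PromptLayer: Send + Sync {
    fn name(&self) -> &'static str;
    fn priority(&self) -> u32;
    fn paths(&self) -> &'static [AssemblyPath];
    fn inject(&self, output: &mut String, input: &LayerInput);
}

pub struct RoleLayer;

impl PromptLayer for RoleLayer {
    fn name(&self) -> &'static str { "role" }
    fn priority(&self) -> u32 { 100 }
    fn paths(&self) -> &'static [AssemblyPath] {
        &[AssemblyPath::Basic, AssemblyPath::Soul]
    }
    fn inject(&self, output: &mut String, _input: &LayerInput) {
        output.push_str("[role] You are a helpful assistant.\n");
    }
}

pub struct SoulLayer;

impl PromptLayer for SoulLayer {
    fn name(&self) -> &'static str { "soul" }
    fn priority(&self) -> u32 { 50 }
    fn paths(&self) -> &'static [AssemblyPath] { &[AssemblyPath::Soul] }
    fn inject(&self, output: &mut String, _input: &LayerInput) {
        output.push_str("[soul] Identity and personality.\n");
    }
}

/// Hypothetical pipeline body: runs every layer in ascending priority order,
/// regardless of registration order. Path filtering is omitted here.
pub fn assemble(layers: &mut [Box<dyn PromptLayer>], input: &LayerInput) -> String {
    layers.sort_by_key(|layer| layer.priority());
    let mut output = String::new();
    for layer in layers.iter() {
        layer.inject(&mut output, input);
    }
    output
}

/// Registers the layers out of order to show that priority decides placement.
pub fn demo_assemble() -> String {
    let mut layers: Vec<Box<dyn PromptLayer>> =
        vec![Box::new(RoleLayer), Box::new(SoulLayer)];
    assemble(&mut layers, &LayerInput { path: AssemblyPath::Soul })
}
```

Although RoleLayer is registered first, the SoulLayer text comes first in the output because its priority (50) is lower than RoleLayer's (100).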

29 Layers (sorted by priority)

Stable zone — content rarely changes, eligible for section-level caching (priorities 50–1600):

| Priority | Layer | Notes |
|---|---|---|
| 50 | SoulLayer | Identity / personality |
| 55 | AgentRoleLayer | Sub-agent role header + protocol blocks |
| 75 | ProfileLayer | Workspace profile overlay |
| 100 | RoleLayer | Base assistant role |
| 300 | EnvironmentLayer | OS, date, working directory |
| 400 | RuntimeCapabilitiesLayer | Python, Node.js, FFmpeg, etc. |
| 500 | ToolsLayer | Tool definitions (text schema) |
| 501 | HydratedToolsLayer | Semantic-retrieval tool definitions |
| 550 | ToolUsageGrammarLayer | Data-driven tool usage conventions |
| 600 | SecurityLayer | Safety / security guidelines |
| 700 | ProtocolTokensLayer | JSON-RPC protocol tokens |
| 710 | HeartbeatLayer | Session keep-alive instructions |
| 800 | OperationalGuidelinesLayer | Operational rules |
| 900 | CitationStandardsLayer | Citation formatting |
| 1000 | GenerationModelsLayer | Available image/video/audio models |
| 1050 | SkillInstructionsLayer | Active skill instructions |
| 1100 | SpecialActionsLayer | Special action syntax |
| 1200 | ResponseFormatLayer | Response structure |
| 1300 | GuidelinesLayer | General guidelines |
| 1350 | ThinkingGuidanceLayer | Structured reasoning guidance |
| 1400 | SkillModeLayer | Strict skill workflow enforcement |
| 1500 | CustomInstructionsLayer | User custom instructions |
| 1600 | LanguageLayer | Response language |

Dynamic zone — per-request, never cached (priorities 1700–1750):

| Priority | Layer | Notes |
|---|---|---|
| 1700 | InboundContextLayer | Sender, channel, session metadata |
| 1710 | VoiceModeLayer | Voice-specific response instructions |
| 1720 | RuntimeContextLayer | Current time, session info |
| 1730 | IdentityFilesLayer | SOUL.md, IDENTITY.md workspace files |
| 1740 | MemoryAugmentationLayer | Dual-path memory injection |
| 1750 | SessionContextGuideLayer | Compressed session context guidance |

Assembly Paths

Each layer declares which AssemblyPath values it participates in. The pipeline filters layers at assembly time based on the selected path:

| Path | Description |
|---|---|
| Basic | Minimal system prompt: config + tool list only |
| Hydration | Tools come from semantic retrieval (HydrationResult) |
| Soul | Soul-enriched; includes identity / personality layers |
| Context | Context-aware; uses ResolvedContext |
| Cached | Pre-cached stable prefix |

The default path for most calls is Soul, which includes identity, personality, and the full tool suite.
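A rough sketch of path-based filtering follows. The five AssemblyPath variants come from the table above; the LayerSpec struct, the layers_for function, and in particular which layer belongs to which path are hypothetical choices made for the example, since this page does not list path membership per layer.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AssemblyPath {
    Basic,
    Hydration,
    Soul,
    Context,
    Cached,
}

/// Hypothetical descriptor standing in for a full PromptLayer implementation.
pub struct LayerSpec {
    pub name: &'static str,
    pub paths: &'static [AssemblyPath],
}

/// Keeps only the layers that declare the selected path.
pub fn layers_for(path: AssemblyPath, layers: &[LayerSpec]) -> Vec<&'static str> {
    layers
        .iter()
        .filter(|layer| layer.paths.contains(&path))
        .map(|layer| layer.name)
        .collect()
}

/// Example registry. The path membership here is assumed, not documented.
pub fn example_layers() -> Vec<LayerSpec> {
    vec![
        LayerSpec { name: "soul", paths: &[AssemblyPath::Soul] },
        LayerSpec {
            name: "tools",
            paths: &[AssemblyPath::Basic, AssemblyPath::Soul, AssemblyPath::Context],
        },
        LayerSpec { name: "hydrated_tools", paths: &[AssemblyPath::Hydration] },
    ]
}
```

With this made-up registry, selecting Soul keeps "soul" and "tools", while selecting Hydration keeps only "hydrated_tools".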

Section Caching

PromptPipeline caches the output of LayerStability::Stable layers after the first call. Dynamic layers always recompute. 23 of the 29 layers are Stable; the remaining 6 are Dynamic.

pipeline.invalidate("soul");    // Invalidate one layer by name
pipeline.invalidate_all();      // Clear all cached sections
pipeline.cache_stats();         // CacheStats { hits, misses, entries }

Cache invalidation triggers:

  • Tool list change → invalidate "tools" and "hydrated_tools"
  • Soul change → invalidate "soul"
  • Session reset → invalidate_all()
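The caching behavior above could be modeled roughly as follows. The method names invalidate, invalidate_all, and cache_stats and the CacheStats fields mirror the calls shown on this page; the SectionCache type, its get_or_render method, and the storage layout are assumptions for illustration only.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct CacheStats {
    pub hits: u64,
    pub misses: u64,
    pub entries: usize,
}

/// Hypothetical per-section cache keyed by layer name.
#[derive(Default)]
pub struct SectionCache {
    sections: HashMap<String, String>,
    hits: u64,
    misses: u64,
}

impl SectionCache {
    /// Returns the cached section, rendering it only on a miss.
    pub fn get_or_render(&mut self, name: &str, render: impl FnOnce() -> String) -> &str {
        if self.sections.contains_key(name) {
            self.hits += 1;
        } else {
            self.misses += 1;
            self.sections.insert(name.to_string(), render());
        }
        &self.sections[name]
    }

    /// Drops one cached section so the next call re-renders it.
    pub fn invalidate(&mut self, name: &str) {
        self.sections.remove(name);
    }

    /// Drops every cached section, e.g. on session reset.
    pub fn invalidate_all(&mut self) {
        self.sections.clear();
    }

    pub fn cache_stats(&self) -> CacheStats {
        CacheStats {
            hits: self.hits,
            misses: self.misses,
            entries: self.sections.len(),
        }
    }
}
```

In this sketch a soul change would call invalidate("soul"), forcing a single re-render on the next assembly while every other stable section keeps serving from the cache.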

Stable vs Dynamic Zones

| Zone | Layers | Behavior |
|---|---|---|
| Stable | Priorities 50–1600 | Cached across turns; recomputed only on invalidation |
| Dynamic | Priorities 1700–1750 | Recomputed every request; never cached |

Thinking Modes

PromptMode controls which layers participate during assembly:

| Mode | Behavior |
|---|---|
| Full (default) | All 29 layers participate |
| Compact | Excludes 14 heavy layers (runtime context, environment, runtime capabilities, protocol tokens, heartbeat, operational guidelines, citation standards, generation models, skill instructions, special actions, guidelines, thinking guidance, skill mode, custom instructions) |
| Minimal | Only 5 core layers: soul, tools, hydrated tools, response format, language |

Use Compact for high-throughput scenarios where latency matters. Use Minimal for tool-only calls or when token budget is severely constrained.
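The mode rules can be expressed as a small predicate. This is a sketch under assumptions: the PromptMode variant names follow the table above, but the snake_case layer name strings (other than "soul", "tools", and "hydrated_tools", which appear in the caching section) and the participates function are invented for the example.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PromptMode {
    Full,
    Compact,
    Minimal,
}

// Assumed layer names for the 14 layers that Compact excludes.
const COMPACT_EXCLUDED: [&str; 14] = [
    "runtime_context",
    "environment",
    "runtime_capabilities",
    "protocol_tokens",
    "heartbeat",
    "operational_guidelines",
    "citation_standards",
    "generation_models",
    "skill_instructions",
    "special_actions",
    "guidelines",
    "thinking_guidance",
    "skill_mode",
    "custom_instructions",
];

// Assumed layer names for the 5 layers that Minimal keeps.
const MINIMAL_INCLUDED: [&str; 5] = [
    "soul",
    "tools",
    "hydrated_tools",
    "response_format",
    "language",
];

/// Hypothetical check: does `layer` take part in assembly under `mode`?
pub fn participates(mode: PromptMode, layer: &str) -> bool {
    match mode {
        PromptMode::Full => true,
        PromptMode::Compact => !COMPACT_EXCLUDED.contains(&layer),
        PromptMode::Minimal => MINIMAL_INCLUDED.contains(&layer),
    }
}
```

Note the asymmetry: Compact is defined by a deny list, so a layer such as security still participates, while Minimal is defined by an allow list, so the same layer is dropped.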

Configuration

The prompt system is configured via aleph.toml:

[thinker]
# Default mode for system prompt assembly
# Options: "full", "compact", "minimal"
default_mode = "full"

# Maximum total characters in assembled prompt
max_total_chars = 80000

# Protected priorities that survive budget enforcement
protected_priorities = [50, 55, 75, 100, 500, 501, 1200]

[thinker.caching]
# Enable section-level caching for stable layers
enabled = true

[thinker.memory]
# Path to workspace MEMORY.md (relative to workspace root)
memory_file = ".aleph/MEMORY.md"
# Maximum lines to read from MEMORY.md
max_memory_lines = 200
# Maximum characters for memory injection
max_memory_chars = 20000
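To make max_total_chars and protected_priorities more concrete, here is one plausible reading of budget enforcement. This page does not say which unprotected sections are dropped first, so the drop order (highest priority number first), the Section struct, and the enforce_budget function are all assumptions rather than Aleph's documented behavior.

```rust
/// Hypothetical rendered section, tagged with the priority of its layer.
pub struct Section {
    pub priority: u32,
    pub text: String,
}

/// Drops unprotected sections, starting from the end of the prompt, until the
/// total character count fits the budget. Protected priorities are never dropped,
/// so the result can still exceed the budget if protected text alone is too long.
pub fn enforce_budget(
    mut sections: Vec<Section>,
    max_total_chars: usize,
    protected: &[u32],
) -> Vec<Section> {
    let mut total: usize = sections.iter().map(|s| s.text.chars().count()).sum();
    let mut i = sections.len();
    while total > max_total_chars && i > 0 {
        i -= 1;
        if !protected.contains(&sections[i].priority) {
            total -= sections[i].text.chars().count();
            sections.remove(i);
        }
    }
    sections
}
```

Under this reading, the listed priorities (soul, agent role, profile, role, tools, hydrated tools, response format) would always survive trimming.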

Key Layers

AgentRoleLayer

Replaces the old prompt_sections::resolve() function. When LayerInput.agent_def is set, injects role headers and protocol blocks from AgentDef.prompt_sections (e.g. explore_constraints, coder_guidelines, researcher_protocol).

ToolUsageGrammarLayer

Reads ToolInfo.usage_hint fields (prefer_for, prefer_over) and generates data-driven guidelines like "use file_read instead of shell cat". All conventions come from tool definitions — no hardcoded rules.
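A minimal sketch of how such guidelines might be derived from tool metadata is shown below. The field names usage_hint, prefer_for, and prefer_over come from the description above; their types, the struct layouts, the usage_guidelines function, and the exact sentence templates are assumptions.

```rust
/// Assumed shape of the hint; the real field types are not documented here.
pub struct UsageHint {
    pub prefer_for: Option<String>,
    pub prefer_over: Option<String>,
}

pub struct ToolInfo {
    pub name: String,
    pub usage_hint: Option<UsageHint>,
}

/// Builds one guideline line per tool that carries a usage hint.
pub fn usage_guidelines(tools: &[ToolInfo]) -> Vec<String> {
    let mut lines = Vec::new();
    for tool in tools {
        if let Some(hint) = &tool.usage_hint {
            match (&hint.prefer_for, &hint.prefer_over) {
                (Some(task), Some(other)) => {
                    lines.push(format!("- For {}, use {} instead of {}", task, tool.name, other))
                }
                (None, Some(other)) => {
                    lines.push(format!("- Use {} instead of {}", tool.name, other))
                }
                (Some(task), None) => {
                    lines.push(format!("- Prefer {} for {}", tool.name, task))
                }
                (None, None) => {}
            }
        }
    }
    lines
}
```

Because every line is generated from the tool definitions, adding or removing a tool changes the guidelines without touching the layer itself.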

MemoryAugmentationLayer

Dual-path memory injection:

  1. Structured index — reads .aleph/MEMORY.md, truncated to 200 lines
  2. Vector retrieval — top-K semantic search results from sqlite-vec
  3. Wikilink graph — Obsidian-compatible [[note]] links form a traversable knowledge graph
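The structured-index truncation can be sketched as follows, using the max_memory_lines and max_memory_chars limits from the configuration section. The truncate_memory function and the order in which the two limits are applied (lines first, then characters, cutting on whole lines) are assumptions.

```rust
/// Keeps at most `max_lines` lines and at most `max_chars` characters
/// (newlines included), always cutting on a line boundary.
pub fn truncate_memory(text: &str, max_lines: usize, max_chars: usize) -> String {
    let mut out = String::new();
    let mut used = 0usize;
    for line in text.lines().take(max_lines) {
        // +1 accounts for the newline appended after each kept line.
        let len = line.chars().count() + 1;
        if used + len > max_chars {
            break;
        }
        out.push_str(line);
        out.push('\n');
        used += len;
    }
    out
}
```

With the configured values this would be called as truncate_memory(&contents, 200, 20000), where contents is the text read from .aleph/MEMORY.md.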

Related

  • Thinker: LLM interaction and streaming
  • Harness: Think→Act loop execution
  • Memory: RAG, facts, and retrieval
