Markdown Parsing
Markdown code fence parsing and text formatting utilities for streaming output processing.
The markdown and utils modules provide parsing utilities for Markdown content and general-purpose text formatting functions used across the codebase.
Design Philosophy
- Streaming-friendly — Fence parsing handles incomplete Markdown from streaming LLM output
- Lightweight — Minimal dependencies, focused scope
- UTF-8 safe — All string operations respect character boundaries
Markdown Fence Parsing
The markdown::fences module parses code fence blocks in Markdown:
pub struct FenceSpan {
    pub start_line: usize,
    pub end_line: usize,
    pub fence_char: char, // '`' or '~'
    pub info_string: String,
}
Parsing Functions
pub fn parse_fence_spans(text: &str) -> Vec<FenceSpan>
Parses all code fence blocks in the text, handling:
- Backtick fences (```)
- Tilde fences (~~~)
- Info strings (language identifiers)
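- Nested fences (a fence closes only on a run of the same character at least as long as the opening run, so shorter fences inside it are treated as content)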
- Unclosed fences (extends to end of text)
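The cases listed above can be handled in a single pass over the lines. The following is a minimal sketch under stated assumptions, not the module's actual implementation: `sketch_parse_fence_spans` and `fence_marker` are hypothetical names, line numbers are assumed to be 0-based, the derives on `FenceSpan` are added for the example, and CommonMark details such as the three-space indentation limit are ignored.

```rust
/// Mirrors the `FenceSpan` fields shown above; the derives are added for this example.
#[derive(Debug, Clone, PartialEq)]
pub struct FenceSpan {
    pub start_line: usize,
    pub end_line: usize,
    pub fence_char: char,
    pub info_string: String,
}

/// Returns (fence char, run length, trimmed rest of line) if `line` starts with a fence run.
fn fence_marker(line: &str) -> Option<(char, usize, &str)> {
    let trimmed = line.trim_start();
    let ch = trimmed.chars().next()?;
    if ch != '`' && ch != '~' {
        return None;
    }
    let len = trimmed.chars().take_while(|&c| c == ch).count();
    if len < 3 {
        return None;
    }
    // Both fence characters are ASCII, so `len` is also a valid byte offset.
    Some((ch, len, trimmed[len..].trim()))
}

/// Hypothetical sketch of a fence scanner (0-based line numbers assumed).
pub fn sketch_parse_fence_spans(text: &str) -> Vec<FenceSpan> {
    let mut spans = Vec::new();
    // (start line, fence char, run length, info string) of the currently open fence.
    let mut open: Option<(usize, char, usize, String)> = None;
    let mut last_line = 0;

    for (idx, line) in text.lines().enumerate() {
        last_line = idx;
        let marker = fence_marker(line);
        match open.take() {
            None => {
                if let Some((ch, len, info)) = marker {
                    open = Some((idx, ch, len, info.to_string()));
                }
            }
            Some((start, open_ch, open_len, info)) => {
                // A closing fence uses the same character, is at least as long
                // as the opening run, and carries no info string.
                let closes = matches!(
                    marker,
                    Some((ch, len, rest)) if ch == open_ch && len >= open_len && rest.is_empty()
                );
                if closes {
                    spans.push(FenceSpan {
                        start_line: start,
                        end_line: idx,
                        fence_char: open_ch,
                        info_string: info,
                    });
                } else {
                    // Still inside the fence: restore the open state.
                    open = Some((start, open_ch, open_len, info));
                }
            }
        }
    }

    // Unclosed fence (typical for partial streaming output): extend to the last line.
    if let Some((start, ch, _, info)) = open {
        spans.push(FenceSpan {
            start_line: start,
            end_line: last_line,
            fence_char: ch,
            info_string: info,
        });
    }
    spans
}
```

In this sketch a backtick run inside an open tilde fence never closes it, which is what keeps a partially streamed block intact until its real closing fence arrives.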
pub fn find_fence_at(
    text: &str,
    line: usize,
) -> Option<&FenceSpan>
Finds the fence span containing a specific line.
pub fn is_safe_fence_break(
    text: &str,
    pos: usize,
) -> bool
Checks whether it's safe to break the text at a given position (not inside a fence).
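One way to picture this check is to scan only the text before the position and see whether a fence is still open there. The sketch below assumes `pos` is a byte offset and treats an out-of-range or mid-character offset as unsafe. Both choices are assumptions, and `sketch_is_safe_break` is a hypothetical name rather than the module's function.

```rust
/// Hypothetical sketch: true when no code fence is open at byte offset `pos`.
pub fn sketch_is_safe_break(text: &str, pos: usize) -> bool {
    // `get` returns None if `pos` is out of range or splits a UTF-8 character.
    let Some(prefix) = text.get(..pos) else {
        return false;
    };

    // (fence char, run length) of the fence that is currently open, if any.
    let mut open: Option<(char, usize)> = None;
    for line in prefix.lines() {
        let trimmed = line.trim_start();
        let Some(ch) = trimmed.chars().next() else {
            continue;
        };
        if ch != '`' && ch != '~' {
            continue;
        }
        let len = trimmed.chars().take_while(|&c| c == ch).count();
        if len < 3 {
            continue;
        }
        // Fence characters are ASCII, so `len` is a valid byte offset.
        let rest_is_blank = trimmed[len..].trim().is_empty();
        open = match open {
            // Not in a fence: this run opens one.
            None => Some((ch, len)),
            // Same character, long enough, nothing after it: the fence closes.
            Some((open_ch, open_len))
                if ch == open_ch && len >= open_len && rest_is_blank =>
            {
                None
            }
            // Anything else is content of the open fence.
            still_open => still_open,
        };
    }
    open.is_none()
}
```

A streaming caller could use a check like this to avoid flushing a chunk that ends halfway through a code block.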
pub fn get_fence_split(
    text: &str,
) -> FenceSplit
Splits text at a safe boundary, preferring fence boundaries.
Text Formatting
The utils::text_format module provides text manipulation utilities:
pub fn truncate_text(
    text: &str,
    max_chars: usize,
) -> String
Truncates text to at most max_chars characters. The cut point comes from char_indices().nth(), so it always falls on a UTF-8 character boundary.
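The `char_indices().nth()` technique can be sketched in a few lines. This is an illustration of the approach named above, not the module's code: `sketch_truncate` is a hypothetical name, and whether the real function appends an ellipsis or other marker is not stated here, so the sketch appends nothing.

```rust
/// Hypothetical sketch: keep at most `max_chars` characters of `text`.
pub fn sketch_truncate(text: &str, max_chars: usize) -> String {
    match text.char_indices().nth(max_chars) {
        // `byte_idx` is where character number `max_chars` starts,
        // so slicing up to it keeps exactly `max_chars` characters.
        Some((byte_idx, _)) => text[..byte_idx].to_string(),
        // The text has `max_chars` characters or fewer: nothing to cut.
        None => text.to_string(),
    }
}
```

Slicing by a raw byte count instead, for example `&text[..2]` on "héllo", would panic because byte 2 falls inside the two-byte `é`. Counting characters first avoids that.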
JSON Extraction
The utils::json_extract module extracts JSON from text:
pub fn extract_json_objects(
    text: &str,
) -> Vec<&str>
Finds top-level JSON objects in arbitrary text (useful for parsing LLM output that may contain Markdown + JSON).
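A common way to locate top-level objects is to track brace depth while skipping over string literals. The sketch below shows that idea only. `sketch_extract_json_objects` is a hypothetical name, the candidates are not validated as JSON, and the real module may work differently.

```rust
/// Hypothetical sketch: return slices spanning balanced top-level `{ ... }` groups.
pub fn sketch_extract_json_objects(text: &str) -> Vec<&str> {
    let mut objects = Vec::new();
    let mut depth = 0usize;
    let mut start = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    // Scanning bytes is safe: `{`, `}`, `"` and `\` are ASCII and never occur
    // inside a multi-byte UTF-8 sequence, so the slice bounds are valid.
    for (i, b) in text.bytes().enumerate() {
        if depth == 0 {
            // Outside any object: only an opening brace matters.
            if b == b'{' {
                depth = 1;
                start = i;
            }
            continue;
        }
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    objects.push(&text[start..=i]);
                }
            }
            _ => {}
        }
    }
    // An object still open at the end of the text is dropped.
    objects
}
```

Each returned slice is only a candidate, so a caller would still pass it to a real JSON parser before trusting it.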
Path Utilities
The utils::paths module provides path manipulation:
pub fn get_agent_config_dir(
    agent_id: &str,
) -> Result<PathBuf>
Returns the configuration directory for a specific agent, with validation against path traversal (/, \, .., empty strings).
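The validation described above amounts to rejecting any identifier that could escape the base directory. The sketch below is an illustration under assumptions: the function name, the `base` parameter, and the `String` error type are all hypothetical, because where the real base directory lives and which error type `Result` uses are not stated here.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical sketch: join `agent_id` onto `base` only if it cannot escape `base`.
pub fn sketch_agent_config_dir(base: &Path, agent_id: &str) -> Result<PathBuf, String> {
    if agent_id.is_empty() {
        return Err("agent id must not be empty".to_string());
    }
    // Reject both separator styles and any `..` sequence.
    if agent_id.contains('/') || agent_id.contains('\\') || agent_id.contains("..") {
        return Err(format!(
            "agent id {agent_id:?} contains a path separator or '..'"
        ));
    }
    Ok(base.join(agent_id))
}
```

Rejecting `..` anywhere in the identifier is stricter than necessary (it also refuses a name like `a..b`), but it keeps the rule easy to audit.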
pub fn expand_tilde(
    path: &str,
) -> PathBuf
Expands ~ to the user's home directory.
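As a rough illustration of tilde expansion, the sketch below reads the home directory from the `HOME` environment variable. That lookup is an assumption (the real function may resolve the home directory another way), and `sketch_expand_tilde` is a hypothetical name.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical sketch: expand a leading `~` or `~/` using the `HOME` variable.
pub fn sketch_expand_tilde(path: &str) -> PathBuf {
    // Without a known home directory, return the path unchanged.
    let Some(home) = env::var_os("HOME") else {
        return PathBuf::from(path);
    };
    let home = PathBuf::from(home);

    if path == "~" {
        home
    } else if let Some(rest) = path.strip_prefix("~/") {
        home.join(rest)
    } else {
        // Covers ordinary paths and `~user/...`, which this sketch does not resolve.
        PathBuf::from(path)
    }
}
```

Only a leading tilde is expanded, so a tilde in the middle of a path is left untouched.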
PII Scrubbing
The utils::pii module provides log-safe PII scrubbing (less strict than the gateway pii module):
pub fn scrub_pii(text: &str) -> String
The scrubber accepts false positives. It is intended for log output, not for text sent in LLM API calls.
OneOrMany
The utils::one_or_many module handles serialization of single values or arrays:
pub enum OneOrMany<T> {
    One(T),
    Many(Vec<T>),
}
Useful for deserializing config fields that can be either "value" or ["value1", "value2"].
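Code that consumes such a field usually wants a plain list regardless of which form the config used. The sketch below shows that normalization step without any dependencies. `into_vec` is a hypothetical helper, not a method the module is known to provide, and the derives are added for the example. With serde, an enum like this is typically deserialized through the `#[serde(untagged)]` attribute, though whether the module does so is not stated here.

```rust
/// Same shape as the enum above; the derives are added for this example.
#[derive(Debug, Clone, PartialEq)]
pub enum OneOrMany<T> {
    One(T),
    Many(Vec<T>),
}

impl<T> OneOrMany<T> {
    /// Hypothetical helper: flatten either variant into a `Vec<T>`.
    pub fn into_vec(self) -> Vec<T> {
        match self {
            OneOrMany::One(value) => vec![value],
            OneOrMany::Many(values) => values,
        }
    }
}
```

After this step, callers can iterate the result without matching on the variant each time.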
Safety Properties
- UTF-8 safe: char_indices() for truncation, .get(..n) for slicing
- No lock issues: no Mutex/RwLock in these modules
- No static mut: uses OnceLock and LazyLock
- Path traversal protection: get_agent_config_dir validates against .. and separators
- TOCTOU safety: create_dir_all called directly (idempotent, no existence check)
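Two of these properties are easiest to see in code: lazy one-time initialization with `OnceLock` instead of `static mut`, and calling `create_dir_all` directly instead of checking for existence first. The sketch below uses hypothetical names (`CACHE_ROOT`, `cache_root`, `ensure_dir`) and a temp-directory path chosen only for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::OnceLock;

// Initialized at most once, on first use, with no `static mut` or `unsafe`.
static CACHE_ROOT: OnceLock<PathBuf> = OnceLock::new();

/// Hypothetical accessor for a lazily computed directory path.
fn cache_root() -> &'static Path {
    CACHE_ROOT
        .get_or_init(|| std::env::temp_dir().join("fence-docs-example"))
        .as_path()
}

/// Hypothetical helper: make sure `path` exists as a directory.
fn ensure_dir(path: &Path) -> io::Result<()> {
    // No `path.exists()` check beforehand: another process could create or
    // remove the directory between the check and the call. `create_dir_all`
    // already succeeds when the directory exists, so the check adds nothing.
    fs::create_dir_all(path)
}
```

Calling `ensure_dir` repeatedly on the same path is harmless, which is why the existence check can be dropped.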
Code Location
Markdown:
- src/markdown/mod.rs: Module entry point
- src/markdown/fences.rs: Fence parsing
Utils:
- src/utils/mod.rs: Module entry point
- src/utils/text_format.rs: Text truncation
- src/utils/json_extract.rs: JSON extraction
- src/utils/paths.rs: Path utilities
- src/utils/pii.rs: Log-safe PII scrubbing
- src/utils/one_or_many.rs: Single/array serialization
See Also
- Builtin Tools — Tools that use markdown parsing
- PII Protection — Gateway-level PII filtering