Testing
Test framework, test categories, running tests, mocking patterns, and CI pipeline.
Aleph uses the standard Rust testing ecosystem — cargo test as the runner, with additional frameworks for BDD-style integration tests and benchmarks. This page covers how to run tests, how they are organized, and how to write new ones.
Test Framework Stack
| Tool | Purpose |
|---|---|
cargo test | Built-in test runner for unit and integration tests |
criterion | Statistical benchmarking (with async Tokio support) |
cucumber | BDD-style feature tests (Gherkin syntax) |
tempfile | Temporary directories for test isolation |
serial_test | Serialize tests that share global state |
filetime | File timestamp manipulation for watcher tests |
These are declared in the core/Cargo.toml dev-dependencies:
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports", "async_tokio"] }
cucumber = { version = "0.21", features = ["macros"] }
tempfile = "3.8"
filetime = "0.2"
serial_test = "3.0"
aleph-sdk = { path = "../apps/shared" }
aleph-protocol = { path = "../shared/protocol" }Test Categories
Unit Tests
Unit tests live inside the source file they test, in a #[cfg(test)] module:
// src/memory/context.rs
pub fn normalize_fact(raw: &str) -> String {
raw.trim().to_lowercase()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_normalize_fact_trims_whitespace() {
assert_eq!(normalize_fact(" hello "), "hello");
}
#[test]
fn test_normalize_fact_lowercases() {
assert_eq!(normalize_fact("Hello World"), "hello world");
}
}Async unit tests use the tokio::test macro:
#[cfg(test)]
mod tests {
use super::*;
#[tokio::test]
async fn test_session_creation() {
let manager = SessionManager::new_test().await;
let session = manager.create("test-user").await.unwrap();
assert!(!session.id.is_empty());
}
}Integration Tests
Integration tests live in core/tests/ and test interactions across modules. They compile as separate crates, so they can only use the public API of alephcore.
core/tests/
├── cucumber.rs # BDD test runner (Gherkin features)
├── features/ # .feature files for Cucumber tests
└── ...The Cucumber test runner is configured as a custom test harness:
[[test]]
name = "cucumber"
harness = falseDoc Tests
Documentation examples in /// comments are compiled and run as tests:
/// Parses a task status from a string.
///
/// # Examples
///
/// ```
/// use alephcore::poe::types::TaskStatus;
///
/// let status: TaskStatus = "pending".parse().unwrap();
/// assert_eq!(status, TaskStatus::Pending);
/// ```
impl FromStr for TaskStatus {
// ...
}Benchmarks
Criterion benchmarks live in core/benches/ and are declared in Cargo.toml:
[[bench]]
name = "memory_benchmarks_simple"
harness = false
[[bench]]
name = "approval_performance"
harness = falseRunning Tests
All Tests
# Run all tests in the workspace
cargo test --workspace
# Run only tests in the core crate
cargo test -p alephcore
# Run tests in the CLI crate
cargo test -p aleph-cliSpecific Tests
# Run a single test by name
cargo test test_normalize_fact_trims_whitespace
# Run all tests in a module
cargo test memory::context::tests
# Run tests matching a pattern
cargo test session
# Run a specific integration test file
cargo test --test cucumberTest Options
# Show stdout from passing tests (normally suppressed)
cargo test -- --nocapture
# Run tests sequentially (useful for debugging)
cargo test -- --test-threads=1
# List all test names without running them
cargo test -- --list
# Run ignored tests (typically slow or requiring external services)
cargo test -- --ignoredBenchmarks
# Run all benchmarks
cargo bench
# Run a specific benchmark
cargo bench --bench memory_benchmarks_simple
# Benchmark with HTML report
cargo bench -- --output-format=bencherCriterion writes HTML reports to target/criterion/.
Feature-Gated Tests
Some tests require specific features to be enabled:
# Run tests that need the test-helpers feature
cargo test --features test-helpers
# Run tests for the gateway feature
cargo test --features gateway
# Run the full test suite with all features
cargo test --all-featuresTest Fixtures and Helpers
Temporary Directories
Use tempfile for tests that create files or databases:
use tempfile::TempDir;
#[tokio::test]
async fn test_database_persistence() {
let tmp = TempDir::new().unwrap();
let db_path = tmp.path().join("test.db");
let store = SqliteStore::open(&db_path).await.unwrap();
store.insert_fact("test fact").await.unwrap();
// tmp is dropped at the end, cleaning up automatically
}Serial Tests
Tests that share global state (environment variables, singleton registries) use the serial_test crate:
use serial_test::serial;
#[test]
#[serial]
fn test_config_from_env() {
std::env::set_var("ANTHROPIC_API_KEY", "test-key");
let config = Config::from_env().unwrap();
assert_eq!(config.api_key, "test-key");
std::env::remove_var("ANTHROPIC_API_KEY");
}Test Configuration
Tests that need a valid Config object without real API keys:
fn test_config() -> Config {
Config {
api_key: "test-key-not-real".into(),
base_url: "http://localhost:9999".into(),
model: "test-model".into(),
..Config::default()
}
}Mocking Patterns
Aleph primarily uses trait-based mocking — defining traits for external boundaries and providing test implementations.
Trait-Based Test Doubles
// Production trait
#[async_trait]
pub trait MemoryStore: Send + Sync {
async fn store_fact(&self, fact: MemoryFact) -> Result<()>;
async fn search(&self, query: &str, limit: usize) -> Result<Vec<MemoryFact>>;
}
// Test implementation
#[cfg(test)]
pub struct MockMemoryStore {
facts: std::sync::Mutex<Vec<MemoryFact>>,
}
#[cfg(test)]
#[async_trait]
impl MemoryStore for MockMemoryStore {
async fn store_fact(&self, fact: MemoryFact) -> Result<()> {
self.facts.lock().unwrap().push(fact);
Ok(())
}
async fn search(&self, query: &str, limit: usize) -> Result<Vec<MemoryFact>> {
let facts = self.facts.lock().unwrap();
Ok(facts.iter()
.filter(|f| f.content.contains(query))
.take(limit)
.cloned()
.collect())
}
}In-Process Server for Integration Tests
For gateway integration tests, spin up a real server in the test process:
#[tokio::test]
async fn test_gateway_roundtrip() {
// Start server on a random port
let server = TestServer::start().await;
let addr = server.addr();
// Connect a client
let client = AlephClient::connect(&format!("ws://{addr}")).await.unwrap();
// Send a request
let response = client.send_message("hello").await.unwrap();
assert!(response.is_ok());
}CI Pipeline
Recommended CI Steps
A typical CI workflow for Aleph:
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
# Format check
- run: cargo fmt --all -- --check
# Lint
- run: cargo clippy --workspace --all-targets -- -D warnings
# Type check (faster than full build)
- run: cargo check --workspace
# Tests
- run: cargo test --workspace
# Doc tests
- run: cargo doc --workspace --no-depsKey CI Checks
| Check | Command | Purpose |
|---|---|---|
| Format | cargo fmt --all -- --check | Consistent code formatting |
| Lint | cargo clippy --workspace -- -D warnings | Catch common mistakes |
| Type check | cargo check --workspace | Fast compilation verification |
| Unit tests | cargo test --workspace | Correctness verification |
| Doc build | cargo doc --workspace --no-deps | Ensure doc comments compile |
Test Coverage
Generating Coverage Reports
Use cargo-llvm-cov for coverage reports:
# Install
cargo install cargo-llvm-cov
# Generate HTML report
cargo llvm-cov --workspace --html
# Open the report
open target/llvm-cov/html/index.html
# Generate LCOV format (for CI integration)
cargo llvm-cov --workspace --lcov --output-path coverage.lcovCoverage Guidelines
Aleph does not enforce a strict coverage percentage, but follows these principles:
- Core logic (agent loop, memory, tools) should have thorough unit tests
- Parsing and serialization (FromStr, Serialize/Deserialize) should test round-trips
- Error paths should be tested as thoroughly as happy paths
- Integration boundaries (database, network) should have integration tests with real or in-process backends
- UI code (Control Plane) is tested manually — WASM test tooling is limited
Writing Good Tests
Naming Convention
Test names describe the scenario and expected outcome:
#[test]
fn parse_task_status_returns_pending_for_lowercase_input() { ... }
#[test]
fn session_manager_rejects_expired_tokens() { ... }
#[test]
fn memory_search_returns_empty_vec_when_no_facts_match() { ... }Test Structure (Arrange-Act-Assert)
#[tokio::test]
async fn memory_store_returns_facts_sorted_by_relevance() {
// Arrange
let store = setup_test_store().await;
store.insert(fact("Rust is a systems language")).await.unwrap();
store.insert(fact("Python is interpreted")).await.unwrap();
// Act
let results = store.search("Rust programming", 10).await.unwrap();
// Assert
assert!(!results.is_empty());
assert!(results[0].content.contains("Rust"));
}Avoid Test Anti-Patterns
- Do not assert on exact string equality for AI-generated output
- Do not depend on network availability in unit tests
- Do not share mutable state across tests without
serial_test - Do not use
sleepfor synchronization — use channels or condition variables