Public, minimal extract of the Chronoxide (OTLP-native TSDB) workbench, focused on SymbolTable / string interning performance and memory behavior under high-cardinality label workloads.
This repo exists to accompany the "Arc<str> vs arena interning" write-up and to provide a small, reproducible codebase with:
- `ArcSymbolTable`: baseline `HashMap<Arc<str>, SymbolId>` interner.
- `ArenaSymbolTable`: arena-backed interner (single `Vec<u8>` + `(offset, len)` per symbol).
- `TrackingAllocator`: a cross-platform global allocator wrapper that tracks requested vs usable allocation sizes to approximate allocator internal fragmentation.
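To make the contrast between the two interners concrete, here is a minimal sketch of both layouts. It is not the code in `symbol_table.rs`: the names `ArcInterner` and `ArenaInterner`, the `u32` width of `SymbolId`, and the hash-bucket index are all assumptions made for illustration.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::BuildHasher;
use std::sync::Arc;

/// Assumed id width; the real `SymbolId` type may differ.
pub type SymbolId = u32;

/// Baseline layout: every symbol is its own `Arc<str>` heap allocation.
#[derive(Default)]
pub struct ArcInterner {
    ids: HashMap<Arc<str>, SymbolId>,
    strings: Vec<Arc<str>>,
}

impl ArcInterner {
    pub fn intern(&mut self, s: &str) -> SymbolId {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = u32::try_from(self.strings.len()).expect("too many symbols");
        let shared: Arc<str> = Arc::from(s);
        self.ids.insert(Arc::clone(&shared), id);
        self.strings.push(shared);
        id
    }

    pub fn resolve(&self, id: SymbolId) -> Option<&str> {
        self.strings.get(id as usize).map(|s| &**s)
    }
}

/// Arena layout: all bytes live in one `Vec<u8>`; a symbol is `(offset, len)`.
#[derive(Default)]
pub struct ArenaInterner {
    bytes: Vec<u8>,
    spans: Vec<(u32, u32)>,
    /// Hash of the string -> candidate ids (simplified index for this sketch).
    buckets: HashMap<u64, Vec<SymbolId>>,
    hasher: RandomState,
}

impl ArenaInterner {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn lookup(&self, s: &str) -> Option<SymbolId> {
        let hash = self.hasher.hash_one(s);
        // Compare against the arena bytes to rule out hash collisions.
        self.buckets
            .get(&hash)?
            .iter()
            .copied()
            .find(|&id| self.resolve(id) == Some(s))
    }

    pub fn intern(&mut self, s: &str) -> SymbolId {
        if let Some(id) = self.lookup(s) {
            return id;
        }
        let offset = u32::try_from(self.bytes.len()).expect("arena exceeds u32 offsets");
        let len = u32::try_from(s.len()).expect("string exceeds u32 length");
        let id = u32::try_from(self.spans.len()).expect("too many symbols");
        self.bytes.extend_from_slice(s.as_bytes());
        self.spans.push((offset, len));
        let hash = self.hasher.hash_one(s);
        self.buckets.entry(hash).or_default().push(id);
        id
    }

    pub fn resolve(&self, id: SymbolId) -> Option<&str> {
        let &(offset, len) = self.spans.get(id as usize)?;
        let start = offset as usize;
        std::str::from_utf8(&self.bytes[start..start + len as usize]).ok()
    }
}
```

The per-bucket `Vec` keeps the sketch within the standard library, but it reintroduces one small allocation per distinct hash. An arena interner that aims to minimize allocation count would more likely index ids in a raw hash table instead.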

Relevant files:

- `chronoxide-core/src/labels/symbol_table.rs`: `ArcSymbolTable` + `ArenaSymbolTable`
- `src/alloc_tracking.rs`: `TrackingAllocator`
- `benches/symbol_table.rs`: Criterion benchmark comparing intern/lookup/resolve
- `examples/symbol_table_memory.rs`: memory + fragmentation experiment using `TrackingAllocator`

Requirements:

- Rust 1.92.0+ (workspace `rust-version`)
- No external services required (synthetic dataset generation)

Run the benchmark:

`cargo bench -p chronoxide-core --bench symbol_table -- --warm-up-time 5 --sample-size 400`

The benchmark prints:
- wall-clock timings for intern/lookup/resolve
- best-effort size estimates (`estimate_allocated_bytes`, `estimate_used_bytes`)

Run the memory experiment:

`cargo run --release -p chronoxide-core --example symbol_table_memory -- 512 25000 75000`

Arguments:
- `512`: number of unique “keys” generated (plus `__name__`)
- `25000`: number of “common” values (reused frequently)
- `75000`: number of “rare” values (high-cardinality / mostly-new)

The output includes:
- `req_current`: total bytes requested by allocations still live
- `usable_current`: total bytes actually reserved by the allocator (includes rounding)
- `internal_frag`: `usable_current - req_current` and percentage
- allocation call counts (`alloc_calls`, `realloc_calls`)
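
As a rough illustration of where these counters could come from, the sketch below wraps the system allocator and tracks requested bytes and call counts. It is not the repo's `TrackingAllocator`: `CountingAllocator` and the `internal_frag` helper are hypothetical names, the percentage is assumed to be relative to `usable_current`, and the usable-size side is omitted because it needs a platform-specific query (for example `malloc_usable_size` on glibc).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

/// Hypothetical stand-in: counts requested bytes only, not usable bytes.
pub struct CountingAllocator {
    req_current: AtomicUsize,
    alloc_calls: AtomicUsize,
    realloc_calls: AtomicUsize,
}

impl CountingAllocator {
    pub const fn new() -> Self {
        Self {
            req_current: AtomicUsize::new(0),
            alloc_calls: AtomicUsize::new(0),
            realloc_calls: AtomicUsize::new(0),
        }
    }

    pub fn req_current(&self) -> usize {
        self.req_current.load(Relaxed)
    }

    pub fn alloc_calls(&self) -> usize {
        self.alloc_calls.load(Relaxed)
    }

    pub fn realloc_calls(&self) -> usize {
        self.realloc_calls.load(Relaxed)
    }
}

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            self.req_current.fetch_add(layout.size(), Relaxed);
            self.alloc_calls.fetch_add(1, Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.req_current.fetch_sub(layout.size(), Relaxed);
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let new_ptr = unsafe { System.realloc(ptr, layout, new_size) };
        if !new_ptr.is_null() {
            // Only adjust the live total when the resize actually succeeded.
            if new_size >= layout.size() {
                self.req_current.fetch_add(new_size - layout.size(), Relaxed);
            } else {
                self.req_current.fetch_sub(layout.size() - new_size, Relaxed);
            }
            self.realloc_calls.fetch_add(1, Relaxed);
        }
        new_ptr
    }
}

/// Internal fragmentation in bytes and as a percentage of usable bytes
/// (the choice of denominator is an assumption of this sketch).
pub fn internal_frag(req_current: usize, usable_current: usize) -> (usize, f64) {
    let bytes = usable_current.saturating_sub(req_current);
    let pct = if usable_current == 0 {
        0.0
    } else {
        100.0 * bytes as f64 / usable_current as f64
    };
    (bytes, pct)
}
```

To count every allocation in a process, a wrapper like this would be registered with `#[global_allocator] static GLOBAL: CountingAllocator = CountingAllocator::new();`. With 100 requested bytes rounded up to 128 usable bytes, `internal_frag(100, 128)` gives 28 bytes, or 21.875%.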
The synthetic string generator is tuned to resemble real OTLP label workloads we observed during an 11M-message ingestion run:
- short keys (≈10–20 bytes typical, max ≈71)
- short-to-medium values with a long tail (up to ≈2048 bytes; the P99 of the per-series maximum value length is 193)
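
A generator with this shape could be sketched as below. This is not the repo's generator: `XorShift64`, `key_len`, `value_len`, and `ascii_string` are hypothetical names, and the 95/5 and 99/1 mixing ratios are illustrative assumptions rather than measured values. Only the length bounds (10–20, 71, 193, 2048) come from the figures above.

```rust
/// Tiny deterministic PRNG so the sketch needs no external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must be non-zero.
        Self(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Integer in `lo..=hi`; modulo bias is ignored in this sketch.
    pub fn range(&mut self, lo: usize, hi: usize) -> usize {
        lo + (self.next_u64() % (hi - lo + 1) as u64) as usize
    }
}

/// Key length: mostly 10..=20 bytes, occasionally up to 71 (assumed 95/5 split).
pub fn key_len(rng: &mut XorShift64) -> usize {
    if rng.range(0, 99) < 95 {
        rng.range(10, 20)
    } else {
        rng.range(21, 71)
    }
}

/// Value length: bulk up to 193 bytes, rare tail up to 2048 (assumed 99/1 split).
pub fn value_len(rng: &mut XorShift64) -> usize {
    match rng.range(0, 999) {
        0..=989 => rng.range(1, 193),
        _ => rng.range(194, 2048),
    }
}

/// Random ASCII string of exactly `len` bytes.
pub fn ascii_string(rng: &mut XorShift64, len: usize) -> String {
    const ALPHABET: &[u8] = b"abcdefghijklmnopqrstuvwxyz0123456789_";
    (0..len)
        .map(|_| ALPHABET[rng.range(0, ALPHABET.len() - 1)] as char)
        .collect()
}
```

Seeding the PRNG explicitly keeps runs reproducible, which matters when comparing memory numbers between the two interners on identical input.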