The random utilities address two kinds of stability: the same seed should reproduce the same sequence, and changing dataset size should not reshuffle every existing split.
1. Core Idea
StableRNG is for tests, demos, benchmarks, and LLM-application fixtures where “random” data still needs to be debuggable. If a failure happens with seed 42, you should be able to regenerate the same rows, vectors, and samples later.
There are two stability patterns:
- Sequence stability: the same seed and the same draw sequence produce the same output. Use a with StableRNG(seed=...) context when a sequence of draws should advance together.
- Membership stability: adding more candidate items should not change the split result for every existing item. Use hash_sample(...) and hash_split(...) for stable samples and partitions.
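To make sequence stability concrete, here is a minimal sketch that uses only Python's standard library. It is not StableRNG's implementation or API; the helper name draw_sequence is hypothetical and exists only to show that the same seed plus the same draw order gives the same values.

```python
import random


def draw_sequence(seed: int, count: int) -> list[float]:
    """Return `count` floats from a private generator seeded with `seed`.

    Hypothetical helper for illustration; not part of StableRNG.
    """
    rng = random.Random(seed)  # private generator, global random state is untouched
    return [rng.random() for _ in range(count)]


first = draw_sequence(42, 5)
second = draw_sequence(42, 5)
prefix = draw_sequence(42, 3)
other = draw_sequence(43, 5)

print(first == second)      # True: same seed, same draw order
print(prefix == first[:3])  # True: a shorter run is a prefix of the longer one
print(first == other)       # a different seed gives a different sequence
```

The point of keeping the generator private is that unrelated code calling the global random functions cannot shift the sequence you are trying to reproduce.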
2. Generate a Reproducible Sequence
Create a generator with a seed and use it as a context when multiple draws belong to one sequence.
3. Derive Child Streams
Use step(...) to split one base seed into named streams without mutating the parent generator.
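The idea of named child streams can be sketched with the standard library alone. This is an assumption about one way such a derivation could work, not a description of how step(...) is implemented; child_seed and child_rng are hypothetical names. Each child seed is computed from the base seed and a stream name, so the parent generator is never advanced.

```python
import hashlib
import random


def child_seed(base_seed: int, name: str) -> int:
    """Derive a 64-bit seed from a base seed and a stream name (hypothetical helper)."""
    digest = hashlib.sha256(f"{base_seed}:{name}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def child_rng(base_seed: int, name: str) -> random.Random:
    """Return an independent generator for the named stream."""
    return random.Random(child_seed(base_seed, name))


parent = random.Random(42)
state_before = parent.getstate()

users = child_rng(42, "users")
orders = child_rng(42, "orders")
user_ids = [users.randint(1, 1000) for _ in range(3)]
order_totals = [orders.random() for _ in range(3)]

# Deriving and using child streams did not consume any draws from the parent.
print(parent.getstate() == state_before)  # True

# Re-deriving the same named stream reproduces the same values.
print([child_rng(42, "users").randint(1, 1000)] == user_ids[:1])  # True
```

Because each stream depends only on the base seed and its name, adding a new stream later (for example "payments") leaves the values in "users" and "orders" unchanged.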
4. Generate Batches and Vectors
Most generators accept n for a count or shape. Use this for fixtures that need many values at once.
rnd_vec(...) returns unit-length vectors. That makes it useful for LLM and vector-search development, where you often need embeddings-shaped data before wiring a real embedding provider into the example.
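As a rough illustration of a seeded batch of unit-length vectors, the sketch below draws Gaussian components and normalizes each vector. It is a standard-library stand-in, not rnd_vec(...) itself; the function name unit_vectors and its parameters seed, n, and dim are assumptions made for this example.

```python
import math
import random


def unit_vectors(seed: int, n: int, dim: int) -> list[list[float]]:
    """Return `n` seeded vectors of length `dim`, each scaled to unit L2 norm.

    Hypothetical helper for illustration; not the library's rnd_vec.
    """
    rng = random.Random(seed)
    vectors = []
    for _ in range(n):
        raw = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # A zero norm is not a practical concern for Gaussian draws.
        norm = math.sqrt(sum(x * x for x in raw))
        vectors.append([x / norm for x in raw])
    return vectors


batch = unit_vectors(seed=7, n=4, dim=8)
norms = [math.sqrt(sum(x * x for x in vec)) for vec in batch]

print(len(batch), len(batch[0]))                        # 4 8
print(all(abs(norm - 1.0) < 1e-9 for norm in norms))    # True
print(batch == unit_vectors(seed=7, n=4, dim=8))        # True: same seed, same batch
```

Unit-length vectors are convenient as placeholder embeddings because the dot product of two such vectors equals their cosine similarity.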
5. Sample Without Rewriting the World
Use hash_sample(...) and hash_split(...) when a stable sample should not depend on input order or on unrelated new records.
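Membership stability can be sketched by hashing each item together with the seed and comparing the result to a threshold. The code below is an illustrative stand-in built on hashlib; it does not reproduce the signatures or internals of hash_sample(...) and hash_split(...), and the names hash_fraction, stable_sample, and stable_split are hypothetical.

```python
import hashlib


def hash_fraction(item: str, seed: int) -> float:
    """Map (seed, item) to a number between 0 and 1 that ignores list position."""
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def stable_sample(items, fraction: float, seed: int) -> set[str]:
    """Keep each item whose hash fraction falls below `fraction`."""
    return {item for item in items if hash_fraction(item, seed) < fraction}


def stable_split(items, train_fraction: float, seed: int) -> dict[str, str]:
    """Assign each item to "train" or "test" based only on its own hash."""
    return {
        item: "train" if hash_fraction(item, seed) < train_fraction else "test"
        for item in items
    }


before = ["p1", "p2", "p3", "p4"]
after = ["p5", "p4", "p3", "p2", "p1"]  # new item added, order changed

split_before = stable_split(before, train_fraction=0.5, seed=42)
split_after = stable_split(after, train_fraction=0.5, seed=42)

# Existing items keep their assignment regardless of order or new records.
print(all(split_after[item] == split_before[item] for item in before))  # True
```

Note that a hash threshold gives an approximate fraction rather than an exact count: with a small input, the sample may contain more or fewer items than fraction times the input size.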
"p5", the decision for "p1" through "p4" is still based on each item’s hash and the seed, not on the original list position.
HeavenBase benchmarks use seeded generation so row content, samples, and vectors can be reproduced.

