feat: Add Seed Support for Reproducible Voice Generation (v1 & v2) by elismasilva · Pull Request #327 · OpenBMB/VoxCPM

elismasilva · 2026-06-01T19:04:55Z

Overview

This Pull Request adds random seed support (a seed parameter) across the VoxCPM ecosystem, covering both the v1 and v2 architectures. The goal is reproducible speech synthesis: with the same inputs and the same seed, repeated generation runs should yield the same voice timbre, style, and prosody.

Key Changes

Core Models (VoxCPMModel & VoxCPM2Model)
- Exposed an optional seed parameter in both _generate and _generate_with_prompt_cache methods.
- Applied PyTorch CPU and CUDA random number generator seeds (torch.manual_seed and torch.cuda.manual_seed_all) right before the sampling loop in the core inference pipeline.
- Badcase Handling: If automatic badcase recovery is active (retry_badcase=True), the seed is incremented by 1 on each retry attempt. Without this, every retry would reuse the same seed, deterministically reproduce the same bad output, and never recover.
- Exposed self.last_successful_seed as a model state attribute, enabling downstream user interfaces (e.g., Gradio, Streamlit) to easily retrieve and update fields with the exact seed that yielded the successful audio.
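
The seeding and retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: generate_with_retries, sample_fn, is_badcase, and max_retries are hypothetical names, and a stdlib random.Random instance stands in for the torch.manual_seed and torch.cuda.manual_seed_all calls so the sketch runs without PyTorch.

```python
import random
from typing import Any, Callable, Optional, Tuple


def generate_with_retries(
    sample_fn: Callable[[random.Random], Any],
    is_badcase: Callable[[Any], bool],
    seed: Optional[int] = None,
    retry_badcase: bool = True,
    max_retries: int = 3,
) -> Tuple[Any, Optional[int]]:
    """Return (output, seed_used). All names here are hypothetical."""
    attempts = (max_retries + 1) if retry_badcase else 1
    output = None
    current_seed = seed
    for attempt in range(attempts):
        # Shift the seed by one per retry so a failed attempt is not replayed.
        current_seed = None if seed is None else seed + attempt
        # Stand-in for torch.manual_seed(current_seed) and
        # torch.cuda.manual_seed_all(current_seed) before the sampling loop.
        rng = random.Random(current_seed)
        output = sample_fn(rng)
        if not is_badcase(output):
            break
    # Plays the role of last_successful_seed: the seed of the returned output.
    return output, current_seed
```

Returning the seed alongside the output mirrors the purpose of last_successful_seed: a caller can store the seed that actually produced the audio, even when it differs from the one requested.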

High-Level Pipeline Wrapper (VoxCPM in core.py)
- Propagated the seed argument through the public generate and generate_streaming API entry points.

Command Line Interface (cli.py) & Inference Scripts
- Added the --seed option to the design, clone, and batch subcommands, as well as to the legacy root-level CLI arguments.
- Integrated the --seed flag into both the full-finetune (test_voxcpm_ft_infer.py) and LoRA (test_voxcpm_lora_infer.py) inference test scripts.
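
As a rough sketch of how an optional integer seed can be attached to several subcommands with argparse: the parser layout, program name, and help text below are assumptions for illustration and are not taken from cli.py. Only the subcommand names and the --seed flag come from the description above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser with a --seed option on each subcommand."""
    parser = argparse.ArgumentParser(prog="voxcpm-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in ("design", "clone", "batch"):
        sub = subparsers.add_parser(name)
        # Default of None means "do not seed", leaving generation unseeded.
        sub.add_argument(
            "--seed",
            type=int,
            default=None,
            help="Random seed for reproducible generation (optional).",
        )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["design", "--seed", "42"])
    print(args.command, args.seed)
```

Defaulting to None rather than a fixed number keeps existing invocations unchanged when the flag is omitted.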

Training & Validation
- Set a fixed default evaluation seed (seed=42) inside generate_sample_audio during training. Periodic validation audio generated for TensorBoard now starts from the same initial acoustic noise at every step, so differences between samples reflect changes in the model rather than sampling randomness.
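
To illustrate why a fixed evaluation seed makes validation samples comparable, here is a small sketch in which the starting noise is identical at every validation step. EVAL_SEED and initial_noise are hypothetical names, and the stdlib generator stands in for the PyTorch RNG used by the training script.

```python
import random

EVAL_SEED = 42  # hypothetical constant mirroring the seed=42 default


def initial_noise(seed: int, length: int = 4) -> list:
    """Draw a short Gaussian noise vector from a freshly seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


# Re-seeding before each validation pass gives the same starting noise,
# so only the model weights differ between the samples being compared.
noise_by_step = {step: initial_noise(EVAL_SEED) for step in (1000, 2000, 3000)}
```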

Unit Tests
- Added new test cases to test_cli.py to verify that the command-line parser successfully parses the seed flag and accurately passes it to single-sample and batch generation tasks.
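
A propagation test of this kind might look roughly like the sketch below. It is not copied from test_cli.py: run_design is a hypothetical stand-in for a CLI handler, and the model is a unittest.mock.Mock, so no real inference is run.

```python
import argparse
from unittest import mock


def run_design(args: argparse.Namespace, model) -> object:
    """Hypothetical CLI handler that forwards the parsed seed to the model."""
    return model.generate(text=args.text, seed=args.seed)


def test_seed_is_forwarded() -> None:
    model = mock.Mock()
    args = argparse.Namespace(text="hello", seed=42)
    run_design(args, model)
    # The handler must pass the seed through unchanged.
    model.generate.assert_called_once_with(text="hello", seed=42)
```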

Documentation
- Updated the Python API quick-start examples and CLI usage blocks in both the English (README.md) and Chinese (README_zh.md) documentation to show how to use the new seed parameter.

How to Test

Python API:

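# Assumes `model` is an already-loaded VoxCPM pipeline instance.
wav = model.generate(
    text="This is a test of reproducible voice generation.",
    seed=42,
)
# If badcase retries occurred, this is the incremented seed that produced the returned audio.
print(f"Last seed: {model.tts_model.last_successful_seed}")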

CLI Usage:

voxcpm design \
  --text "Reproducible speech synthesis with a fixed seed." \
  --seed 42 \
  --output out.wav

- Exposed 'seed' parameter in VoxCPMModel and VoxCPM2Model generation methods.
- Added PyTorch RNG seed setting before inference runs.
- Handled 'retry_badcase' seed adjustment by incrementing the seed value on retries.
- Exposed 'self.last_successful_seed' as a model attribute for UI integrations.
- Propagated 'seed' parameter to high-level pipeline class and CLI tools (cli.py).
- Added '--seed' flag to full-finetune and LoRA inference scripts.
- Configured validation audio generation in training script to use a fixed seed for objective comparison on TensorBoard.
- Added comprehensive unit tests in CLI test files to validate seed parsing and propagation.
- Updated English and Chinese READMEs with seed usage examples.

elismasilva wants to merge 1 commit into OpenBMB:main from DEVAIEXP:feat-add-seed
