Multilingual-TTS-4B-Base
Continued pretraining of Qwen/Qwen3-4B-Base on multilingual voice conversion and text-to-speech (TTS).
- Uses neucodec as the speech detokenizer at 50 tokens per second (TPS), with output audio at a 24 kHz sample rate.
- Multi-speaker multilingual voice conversion, up to 35.88B tokens.
- Multi-speaker multilingual TTS covering more than 150 languages, up to 14.64B tokens.
- Flash Attention 3 with variable-length (varlen) multipacking at a 10k context length.
- BF16 training.
- MuonAdamW optimizer.
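
The token rate and packing figures above can be made concrete with a little arithmetic. The following is a minimal sketch, not the training code used for this model: it assumes "10k" means 10,000 tokens, and the helper names (`samples_per_token`, `pack_sequences`, `cu_seqlens`) are hypothetical. It shows how many audio samples one speech token covers and how several sequences can be packed into one context window, with the cumulative sequence boundaries that varlen attention kernels typically take as input.

```python
from itertools import accumulate

TOKENS_PER_SECOND = 50   # from the card: neucodec at 50 TPS
SAMPLE_RATE_HZ = 24_000  # from the card: 24 kHz output audio
MAX_CONTEXT = 10_000     # assumption: "10k" read as 10,000 tokens


def samples_per_token() -> int:
    """Number of output audio samples represented by one speech token."""
    return SAMPLE_RATE_HZ // TOKENS_PER_SECOND


def pack_sequences(lengths, max_len=MAX_CONTEXT):
    """Greedy first-fit packing (an illustrative choice, not the author's).

    Each sequence goes into the first bin that still has room;
    a new bin is opened when none fits.
    """
    bins = []
    for n in lengths:
        if n > max_len:
            raise ValueError(f"sequence of {n} tokens exceeds {max_len}")
        for b in bins:
            if sum(b) + n <= max_len:
                b.append(n)
                break
        else:
            bins.append([n])
    return bins


def cu_seqlens(bin_lengths):
    """Cumulative boundaries of the sequences inside one packed bin."""
    return [0, *accumulate(bin_lengths)]


if __name__ == "__main__":
    print(samples_per_token())
    for b in pack_sequences([6000, 5000, 4000, 3000]):
        print(b, cu_seqlens(b))
```

Under these assumptions, a window filled entirely with speech tokens would hold 10,000 / 50 = 200 seconds of audio; in practice part of the window is likely taken by text and control tokens. The boundary list keeps attention from crossing between packed sequences, which is what makes packing safe for training.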