Sopro TTS: A lightweight text-to-speech model with zero-shot voice cloning.
This tool only requires 3-12 seconds of audio to do voice cloning. And it's not even a very large model. The Reddit "wisdom" is that voice cloning takes hours of audio and/or is "impossible."
And, sure, voice cloning did require that much audio โ in 2018. Things are changing faster than people can keep up.