XTTS

Generate long speech from text in several languages, optionally following a reference voice, then download it. Free to use, with no account and no watermark.

XTTS is a voice generation model that lets you clone voices into different languages using just a quick 3-second audio clip.
This is the same model that powers our creator application Coqui Studio as well as the Coqui API. In production we apply modifications to make low-latency streaming possible.
Leave a star on the GitHub TTS repository, where our open-source inference and training code lives.
Beware of how the full stop at the end of a sentence is pronounced, especially when the last word is unknown to the model. This is the main problem with this AI.
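
As a rough illustration of the voice-cloning workflow described above, here is a minimal sketch using the open-source Coqui TTS Python package. The helper names (`build_request`, `clone_voice`) and the file names are hypothetical, and the sketch assumes the package is installed and that the XTTS v2 model identifier shown is the one you want; it is not necessarily how this Space calls the model.

```python
from pathlib import Path

# Model identifier used by the open-source Coqui TTS package for XTTS v2
# (assumption: this Space may load the model differently).
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"


def build_request(text: str, speaker_wav: str, language: str, out_path: str) -> dict:
    """Collect and sanity-check the keyword arguments for TTS.tts_to_file.

    Hypothetical helper, introduced only for this example.
    """
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("text must not be empty")
    return {
        "text": cleaned,
        "speaker_wav": speaker_wav,  # short reference clip of the voice to clone
        "language": language,        # output language code, e.g. "en" or "fr"
        "file_path": out_path,       # where the generated WAV file is written
    }


def clone_voice(text: str, speaker_wav: str, language: str = "en",
                out_path: str = "output.wav") -> str:
    """Synthesise `text` in the voice of `speaker_wav` and return the output path."""
    # Imported lazily so build_request works without the package installed.
    from TTS.api import TTS

    if not Path(speaker_wav).is_file():
        raise FileNotFoundError(speaker_wav)
    tts = TTS(XTTS_MODEL)  # downloads the model weights on first use
    tts.tts_to_file(**build_request(text, speaker_wav, language, out_path))
    return out_path
```

A call such as `clone_voice("Hello there", "reference.wav", language="en")` would then write `output.wav`, provided `reference.wav` exists and the model can be downloaded.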

To avoid the queue, you can duplicate this Space on CPU, GPU, or ZeroGPU hardware.

Language

Select an output language for the synthesised speech

Gender

Gender of the voice (ignored when a reference audio clip is provided)

Notice: Microphone input may not work properly under heavy traffic.


If checked, the result is different on every run

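
The "always different" option above is presumably backed by a random seed. The following is a minimal sketch of that idea, not the Space's actual code: the function name `resolve_seed`, the `randomize` flag, and the upper bound `MAX_SEED` (taken here as the largest signed 64-bit integer) are all assumptions made for illustration.

```python
import random
from typing import Optional

# Assumed upper bound for the seed: the largest signed 64-bit integer.
MAX_SEED = 2**63 - 1


def resolve_seed(seed: int, randomize: bool,
                 rng: Optional[random.Random] = None) -> int:
    """Return the seed to use for one generation run.

    If `randomize` is true, the given seed is ignored and a fresh one is
    drawn, so repeated runs give different results. Otherwise the given
    seed is validated and returned unchanged, so runs are repeatable.
    """
    if rng is None:
        rng = random.Random()
    if randomize:
        return rng.randint(0, MAX_SEED)  # inclusive on both ends
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be between 0 and {MAX_SEED}")
    return seed


# Fixed seed: repeatable. Randomized: a new value on each call.
fixed = resolve_seed(42, randomize=False)
fresh = resolve_seed(42, randomize=True)
```

Passing the resolved seed to whatever random number generators the model uses is what would make a run repeatable; that wiring depends on the inference code and is left out here.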

XTTS on your computer

You can install Pinokio locally and then install XTTS into it. This should be quite easy if you have an NVIDIA GPU. You can also install XTTS on your computer using Docker, but that is more complicated.
