Generate multilingual speech, music, and sound effects with a text-to-audio model. Ideal for creative projects.

Bark is a transformer-based text-to-audio model that generates realistic, multilingual speech as well as other audio. Unlike conventional text-to-speech models, it can also produce music, background noise, and nonverbal sounds such as laughter or sighs. Pretrained checkpoints are available for inference and may be used commercially, which makes the model accessible to developers and researchers alike.
Bark is open source and released under the MIT License, which encourages community collaboration and reuse. It suits creative projects and research alike.
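As a rough illustration of the speech and nonverbal sounds described above, the sketch below builds a text prompt and passes it to the model. It assumes the open-source `bark` Python package and SciPy are installed. The helper names `build_prompt` and `synthesize`, the `CUES` table, and the output file name are hypothetical and exist only for this example. The bracketed tags and the `♪` markers follow the prompt conventions listed in the Bark README, and the model treats them as hints rather than guarantees.

```python
from typing import Optional

# Nonverbal cue tags taken from the Bark README. The model treats them as
# hints, so a tag may not always be rendered in the output audio.
CUES = {"laughter": "[laughter]", "sighs": "[sighs]", "music": "[music]"}


def build_prompt(text: str, cue: Optional[str] = None, sing: bool = False) -> str:
    """Hypothetical helper: decorate plain text with Bark prompt markers."""
    if sing:
        # The README suggests wrapping song lyrics in music-note characters.
        text = f"♪ {text} ♪"
    if cue is not None:
        # Raises KeyError for a cue that is not in the CUES table.
        text = f"{text} {CUES[cue]}"
    return text


def synthesize(prompt: str, out_path: str = "bark_generation.wav") -> str:
    """Hypothetical wrapper: generate audio for a prompt and save it as WAV."""
    # Imports are deferred so the prompt helper works without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                      # downloads and caches checkpoints on first use
    audio_array = generate_audio(prompt)  # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio_array)
    return out_path


if __name__ == "__main__":
    prompt = build_prompt("Hello, my name is Suno.", cue="laughter")
    print(prompt)
    # Uncomment to run the model (needs the checkpoints and ideally a GPU):
    # synthesize(prompt)
```

Keeping prompt construction separate from synthesis means the text can be inspected or unit-tested without loading the model, which is the slow and resource-heavy step.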