Analysis · 3 min read

Voice AI Race Heats Up: Suno 5.5 Clones Your Voice, Mistral Goes Open-Source TTS

Suno lets you train songs on your own voice, Mistral's Voxtral runs locally with 3-second voice cloning, and Smallest.ai outscores ElevenLabs on quality benchmarks.

Editorial
Mar 28, 2026

The voice and music AI space saw three major launches in a single week.

Suno 5.5, released March 26, introduces voice cloning for Pro and Premier subscribers. To verify that the voice is yours, you record a live sample of a randomly chosen phrase, and Suno then generates songs in your voice. It also adds Custom Models: upload your original tracks and Suno tunes v5.5 to your musical style (up to 3 models). A new 'My Taste' feature learns your genre and mood preferences over time.

Mistral released Voxtral TTS on March 26 — an open-source, 4-billion-parameter text-to-speech model that runs on consumer hardware including laptops and some mobile devices. It achieves 90ms time-to-first-audio and 6x real-time factor. The standout feature: voice cloning from as little as 3 seconds of audio, capturing accent, inflections, and natural fillers like 'ums' and 'ahs.' It supports 9 languages and competes directly with ElevenLabs and Deepgram.
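To put the two speed figures in context, here is a minimal back-of-the-envelope sketch. It assumes "6x real-time factor" means audio is generated six times faster than it plays back; the announcement figures quoted above do not define the term, so treat that reading, and the helper names, as illustrative assumptions.

```python
# Figures quoted in the article. Reading "6x" as "six times faster than
# playback" is an assumption, not a confirmed definition.
TIME_TO_FIRST_AUDIO_S = 0.090
SPEEDUP_OVER_PLAYBACK = 6.0


def synthesis_time_s(audio_seconds: float, speedup: float = SPEEDUP_OVER_PLAYBACK) -> float:
    """Estimated wall-clock time to generate a clip of the given length."""
    if audio_seconds < 0 or speedup <= 0:
        raise ValueError("audio_seconds must be >= 0 and speedup must be > 0")
    return audio_seconds / speedup


def streaming_keeps_up(speedup: float = SPEEDUP_OVER_PLAYBACK) -> bool:
    """True if audio is produced at least as fast as it is played back."""
    return speedup >= 1.0


if __name__ == "__main__":
    print(synthesis_time_s(60.0))   # seconds to synthesize one minute of speech
    print(streaming_keeps_up())     # whether streamed playback could avoid stalls
```

Under that reading, one minute of speech would take about 10 seconds to synthesize, and streamed playback could start after roughly 90 ms and stay ahead of the listener.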

Smallest.ai launched Lightning V3 on March 27, scoring 3.89 MOS (Mean Opinion Score) — outperforming OpenAI, Cartesia, and ElevenLabs on quality benchmarks. It supports 15 languages, offers voice cloning in under 10 seconds, and is optimized for conversational voice agents with natural-sounding pauses and disfluencies. Pricing starts at roughly $0.25 per 10K characters.
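For a sense of scale, the quoted starting price can be turned into a rough cost estimate. This is a sketch that assumes a flat rate of $0.25 per 10,000 characters with no tiers, minimums, or volume discounts; the function name is hypothetical and actual billing may differ.

```python
# Starting price quoted in the article, treated here as a flat rate (assumption).
RATE_USD_PER_10K_CHARS = 0.25


def estimated_cost_usd(num_chars: int, rate_per_10k: float = RATE_USD_PER_10K_CHARS) -> float:
    """Rough synthesis cost for a given number of input characters."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars / 10_000 * rate_per_10k


if __name__ == "__main__":
    # One million characters at the quoted starting rate.
    print(f"${estimated_cost_usd(1_000_000):.2f}")
```

At that rate, a million characters of text would cost about $25.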

The common thread: voice AI is becoming commoditized. Open-source models can now match or beat proprietary ones, voice cloning requires only seconds of audio, and prices keep dropping.
