Pocket TTS delivers high-quality text-to-speech on standard CPUs. No GPU, no cloud APIs. It is the first local TTS with voice ...
Mistral AI has launched Voxtral Transcribe 2, a new on-device speech-to-text model family featuring real-time transcription, ...
Too many GPUs makes you lazy,” says the French startup's vice president of science operations, as the company carves out a ...
Silicon Valley startups and tech giants are pushing voice-based AI dictation as faster than typing, with developers dictating hundreds of thousands of words monthly. Free and paid apps from Wispr Flow ...
Nanospeech is a research-oriented project to build a minimal, easy to understand text-to-speech system that scales to any level of compute. It supports voice matching from a reference speech sample, ...
Abstract: We propose a lightweight end-to-end text-to-speech model using multi-band generation and inverse short-time Fourier transform. Our model is based on VITS, a high-quality end-to-end ...
VoiceCraft is a token infilling neural codec language model, that achieves state-of-the-art performance on both speech editing and zero-shot text-to-speech (TTS) on in-the-wild data including ...
Abstract: Automatic Speech Recognition (ASR) technologies can be life-changing for individuals who suffer from dysarthria, a speech impairment that affects articulatory muscles and results in ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results