Play.ht offers some of the best AI voices to help you create realistic AI voiceovers for your videos, presentations, education and other projects. Play.ht's state-of-the-art Text to Speech editor allows you to create the voiceover according to your needs. You can use multiple AI voices to create conversation-like audio and use full SSML features to enhance your audio.
Play.ht also allows you to embed and distribute your audio files. You can embed the audio using our audio player widgets to increase the accessibility of your articles or web pages. You can use our Podcasting solution to distribute your audio files as podcasts to iTunes and Spotify.
Try Play.ht for free - https://play.ht/
Based on our records, Play.ht seems to be a lot more popular than Silero VAD. While we know about 64 links to Play.ht, we've tracked only 5 mentions of Silero VAD. We track product recommendations and mentions on various public social media platforms and blogs. They can help you identify which product is more popular and what people think of it.
>How do you detect speech starting and stopping? https://github.com/snakers4/silero-vad. - Source: Hacker News / 7 months ago
You could look into https://github.com/guillaumekln/faster-whisper especially the VAD section (Voice Activity Detector) using https://github.com/snakers4/silero-vad. Source: 11 months ago
I also had the same synchronization issue, so I wrote a WebUI/CLI that uses Silero-VAD to first split the audio whenever there is a silent portion (or every 30 seconds), and I haven't experienced it since. Source: about 1 year ago
By the way, I've updated the WebUI to now also support using Silero VAD to break up the audio into distinct sections, and run Whisper on each section and then combine them into one single transcript/SRT file. Source: over 1 year ago
And while googling this, I stumbled upon this discussion on the Whisper GitHub repository, which seems to suggest that the issue is that the current VAD (Voice Activity Detection) is quite poor, and that it can be resolved by using another VAD (like silero-vad). This might be something I want to add to my WebUI in the future. Source: over 1 year ago
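The workflow described in the mentions above (split the audio at silences or every 30 seconds, transcribe each section, then combine the results into one SRT file) can be sketched roughly as follows. This is only an illustrative sketch, not the code of the WebUI mentioned above: the names `Segment`, `cap_span_length`, `format_timestamp` and `segments_to_srt` are hypothetical, the speech spans are assumed to come from a VAD such as Silero VAD, and the text is assumed to come from a transcriber such as Whisper. Neither library is called here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed section (hypothetical helper type)."""
    start: float  # seconds from the start of the full audio
    end: float    # seconds from the start of the full audio
    text: str     # transcript of this section, assumed to come from Whisper


def cap_span_length(spans, max_len=30.0):
    """Split any (start, end) span longer than max_len seconds into pieces.

    `spans` is assumed to be the list of speech spans reported by a VAD.
    """
    capped = []
    for start, end in spans:
        while end - start > max_len:
            capped.append((start, start + max_len))
            start += max_len
        capped.append((start, end))
    return capped


def format_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Combine per-section transcripts into a single SRT document."""
    blocks = []
    ordered = sorted(segments, key=lambda s: s.start)
    for index, seg in enumerate(ordered, start=1):
        time_line = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_line}\n{seg.text.strip()}\n")
    return "\n".join(blocks)
```

Because each section keeps its offset in the full recording, the subtitle timestamps stay aligned with the original audio even when long silences are skipped, which is presumably why this approach helps with the synchronization issue described above.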
There aren't really any models that produce realistic real-time voice. I'd recommend ElevenLabs or play.ht, sadly these seem to be the only useable options for now. Source: 6 months ago
I've used play.ht before. Very easy to use. Source: 11 months ago
Does anyone know what they are using and if it's possible to get it and run it locally? I have a lot of text to voice (1 500 000 characters, 300 000 words), so using services such as ElevenLabs or play.ht would be pretty expensive. The quality is secondary to it being reasonably fast (got a 2060 Super, don't want to run it for 4 months straight to generate all this dialogue). Source: 11 months ago
My experience with play.ht wasn't positive; I had way better luck paying for ElevenLabs premium. Source: 11 months ago
(The biggest problem I have with play.ht is it won't do some things because "Your content violates our standards" and that is for "fight scenes" written over 100 years ago). Source: 11 months ago
The Parodist App - Super-realistic celebs' voices made by AI
Blogcast - Turn your articles into audio
Whisper Memos - Whisper Memos turns your ramblings into paragraphed articles, and emails them to you.
BeyondWords - BeyondWords is an AI voice and audio CMS platform that brings frictionless audio publishing to writers, newsrooms, and businesses. Free Pilot plan available!
Replica - A simple way to save articles, stories and web pages for reading: offline, organized and clean...
Pocket Listen - Reading is hard. Listen to articles instead.