Resemble AI might be a bit more popular than Silero VAD. We know about 7 links to it since March 2021 and only 5 links to Silero VAD. We are tracking product recommendations and mentions on various public social media platforms and blogs. They can help you identify which product is more popular and what people think of it.
I've heard pretty good reviews about resemble.ai for voices, since it uses real people's voices. I've never personally used it though. Source: about 1 year ago
Oh yeah! This really cries for AI usage! Something like synthesia.io and resemble.ai should be in the loop! Source: about 1 year ago
Just out of curiosity, if one made a voice from resemble.ai, and got the api token, could one replace the openai in your shortcut with resembleai and have it use that voice? Source: over 1 year ago
I like resemble(dot)ai but their interface is very slow, maybe someone knows some good TTS APIs? Source: over 1 year ago
This is what I showed in the video: https://github.com/CorentinJ/Real-Time-Voice-Cloning. I've tried it myself, and it's alright, but it's not perfect. Many commercial programs are much better, but they cost a decent amount. Usually, it's either a subscription model or price per word. I've heard descript's overdub is very good. I think resemble.ai is supposed to be decent as well. Source: over 1 year ago
> How do you detect speech starting and stopping? https://github.com/snakers4/silero-vad Source: Hacker News / 7 months ago
You could look into https://github.com/guillaumekln/faster-whisper especially the VAD section (Voice Activity Detector) using https://github.com/snakers4/silero-vad. Source: 11 months ago
I also had the same synchronization issue, so I wrote a WebUI/CLI that uses Silero-VAD to first split the audio whenever there is a silent portion (or every 30 seconds), and I haven't experienced it since. Source: about 1 year ago
By the way, I've updated the WebUI to now also support using Silero VAD to break up the audio into distinct sections, and run Whisper on each section and then combine them into one single transcript/SRT file. Source: over 1 year ago
And while googling this, I stumbled upon this discussion on the Whisper GitHub repository, which seems to suggest that the issue is that the current VAD (Voice Activity Detection) is quite poor, and that it can be resolved by using another VAD (like silero-vad). This might be something I want to add to my WebUI in the future. Source: over 1 year ago
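The split-and-recombine approach described in the mentions above can be sketched roughly as follows. This is a minimal illustration, not the WebUI author's actual code: it assumes a VAD such as Silero VAD has already produced speech spans as (start, end) pairs in seconds, and the names `split_sections`, `srt_time`, `to_srt` and the `transcribe` callback (a stand-in for a Whisper call) are all hypothetical.

```python
def split_sections(speech_spans, max_len=30.0):
    """Turn VAD speech spans into sections, cutting any span longer than max_len.

    speech_spans: list of (start, end) tuples in seconds, assumed to come
    from a voice activity detector (hypothetical input format).
    """
    sections = []
    for start, end in speech_spans:
        # Cut long stretches of speech every max_len seconds.
        while end - start > max_len:
            sections.append((start, start + max_len))
            start += max_len
        if end > start:
            sections.append((start, end))
    return sections


def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(sections, transcribe):
    """Transcribe each section and combine the results into one SRT string.

    transcribe: hypothetical callback taking (start, end) and returning text;
    in practice it would run a speech-to-text model on that slice of audio.
    """
    blocks = []
    for index, (start, end) in enumerate(sections, start=1):
        text = transcribe(start, end)
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    spans = [(0.0, 12.5), (14.0, 80.0)]  # made-up example spans
    sections = split_sections(spans)
    print(to_srt(sections, lambda s, e: f"[speech from {s}s to {e}s]"))
```

Because each section keeps its original start and end time, the timestamps in the combined SRT stay anchored to the source audio, which is presumably why this kind of splitting helps with the synchronization issue mentioned above.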
Descript - Text-based audio editor and automated transcription
The Parodist App - Super-realistic celebs' voices made by AI
Lovo.ai - AI Voice Creation Platform for marketing, HR, audiobook, e-learning, movies and games.
Whisper Memos - Whisper Memos turns your ramblings into paragraphed articles, and emails them to you.
Replica - Simple way to save articles, stories and web pages for reading: offline, organized and clean...
NaturalReader - Main feature: reads text files, including plain text files and MS Word files