This #raspberrypi is always listening… #openai
“abe’s projects”
In this video I use OpenAI’s Whisper model on a Raspberry Pi to transcribe audio from a conference room mic. It works pretty well, but it’s far too slow for a lot of applications. With some creative hacking you may be able to make it useful, I don’t know!
Hi sir can I have your raspberry pi 😅
Using the Google trigger word in your script is a no-no. I won't be back.
That’s a pi 4B
Low voltage detected
The hard part is trying to do realtime inferencing with Whisper. It takes 30-second chunks, so you need to detect mic input, read it, and then when the noise has fully stopped process all of it. This gets complicated because someone may be talking near it but not at it.
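The "detect mic input, read it, then process once the noise has stopped" approach described above could look roughly like the sketch below. It is only a sketch under assumptions that are not from the video: audio arrives as frames of 16-bit integer samples at 16 kHz, speech is detected with a plain loudness threshold, and the names `utterances`, `threshold`, and `hangover_frames` are made up for illustration. It does nothing about the harder problem of someone talking near the mic but not at it.

```python
import math
from typing import Iterable, Iterator, List

SAMPLE_RATE = 16_000             # Whisper works on 16 kHz mono audio
FRAME_SAMPLES = 480              # 30 ms per frame (assumed frame size)
MAX_SAMPLES = 30 * SAMPLE_RATE   # Whisper's 30-second window


def rms(frame: List[int]) -> float:
    """Root-mean-square loudness of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def utterances(
    frames: Iterable[List[int]],
    threshold: float = 500.0,    # assumed loudness cutoff, tune per mic
    hangover_frames: int = 20,   # quiet frames that end an utterance
) -> Iterator[List[int]]:
    """Group frames into utterances that end after a run of silence."""
    buffer: List[int] = []
    silent_run = 0
    for frame in frames:
        if rms(frame) >= threshold:
            # Loud frame: start or continue the current utterance.
            buffer.extend(frame)
            silent_run = 0
        elif buffer:
            # Quiet frame inside an utterance: keep it, count the silence.
            buffer.extend(frame)
            silent_run += 1
            if silent_run >= hangover_frames:
                yield buffer
                buffer, silent_run = [], 0
        # Flush early so no chunk grows past the 30-second window.
        if len(buffer) >= MAX_SAMPLES:
            yield buffer
            buffer, silent_run = [], 0
    if buffer:
        yield buffer
```

Each list this yields is one chunk you would hand to the transcriber. Frames that are quiet while nothing is buffered are dropped, so the Pi only spends time on audio that follows something loud.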
I researched for weeks and couldn't find any method to have it just idly listening for a wake word. As far as I can tell there's no easy way for a consumer to do such a thing.
Amazing brother
Sneaky lol. Hey Google
My smartphone from 8 years ago could do real time offline speech to text so I'm surprised it's as slow as that. What model of pi is it?
You can use CUDA cores for this task. I tested it with an RTX 3050 mobile and speech-to-text was very close to real time! Right now I'm running some tests on a Jetson Nano.
Keeping it offline is actually better, for security reasons
We're from India ❤❤
Watching these videos makes me wish I didn't have ADHD. I can't focus on much but enjoy seeing it!
When you said "Hey Google", the Google assistant on my phone activated and thought it was being talked to. Even video audio can trigger Google assistant.
What about using an Intel NCS2 USB AI accelerator stick?
Forced my phone to google the words "and then". 😂
There's also faster-whisper, which is used as part of Home Assistant's local voice assistant stack. Maybe give that a try.
Stop activating my Google assistant lol
Undervoltage detected lmao
Personally, I just use the API key for whisper. It's very very fast and super accurate
Also, to get good accuracy (better than Google's and the iPhone's), you must set the language to English. If you leave it on automatic, it's worse.
If it takes so much time to transcribe, how much can you speak before the storage is just full?
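A rough back-of-the-envelope answer, as a sketch only: the video does not say how (or whether) the audio is stored, so this assumes uncompressed 16 kHz, 16-bit mono recordings kept on the card until they are processed. The function name `hours_of_audio` and its defaults are made up for illustration.

```python
def hours_of_audio(
    free_bytes: int,
    sample_rate: int = 16_000,   # assumed: 16 kHz, the rate Whisper works at
    bytes_per_sample: int = 2,   # assumed: 16-bit PCM
    channels: int = 1,           # assumed: mono
) -> float:
    """Hours of uncompressed audio that fit in free_bytes of storage."""
    bytes_per_second = sample_rate * bytes_per_sample * channels
    return free_bytes / bytes_per_second / 3600


# Under these assumptions, 32 GB of free space holds this many hours:
print(round(hours_of_audio(32_000_000_000), 1))
```

Under those assumptions audio costs 32,000 bytes per second, so storage fills slowly. If each recording is deleted once its transcript is written, what accumulates is the backlog of unprocessed audio while the Pi catches up.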
Activated Google on my phone when you said hey google lmao
bro didnt even trigger warning box demon for us