3.4 Fusion HAT Microphone
Introduction
The Fusion HAT+ includes a built-in microphone, making it ideal for audio input applications such as voice recognition, sound detection, or recording logs in AI/IoT projects.
This guide will help you check if the microphone is recognized by the system and show you how to perform a basic recording test.
What You'll Need
Below are the components required for this tutorial:
| Component | Purchase Link |
|---|---|
| - | |
| Raspberry Pi (or compatible model) | - |
Run the program
cd ~/ai-lab-kit/llm
sudo python3 stt_vosk_stream.py
The first time you run this code with a new language, Vosk will:
Automatically download the language model (by default, the small version).
Print out the list of supported languages.
Start listening for audio input through the microphone.
You'll see something like this in the terminal:
vosk-model-small-en-us-0.15.zip: 100%|███████████████████| 39.3M/39.3M [00:05<00:00, 7.85MB/s]
['ar', 'ar-tn', 'ca', 'cn', 'cs', 'de', 'en-gb', 'en-in', 'en-us', 'eo', 'es', 'fa', 'fr', 'gu', 'hi', 'it', 'ja', 'ko', 'kz', 'nl', 'pl', 'pt', 'ru', 'sv', 'te', 'tg', 'tr', 'ua', 'uz', 'vn']
Say something
This means:
The model file (`vosk-model-small-en-us-0.15`) has been downloaded.
The list of supported languages has been printed.
The system is now listening. Say something into the Fusion HAT+ microphone, and the recognized text will appear in the terminal.
Tips:
Keep the microphone about 15–30 cm away for better accuracy.
Choose a model that matches your language and accent.
Use a quiet environment to improve recognition.
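To catch a mistyped language code before a model download starts, a script can compare it against the list the program prints on first run. The following is a minimal sketch, assuming the list in the sample output above matches what your installed version supports. `SUPPORTED` and `normalize_language` are illustrative names and are not part of the fusion_hat library.

```python
# Language codes copied from the sample terminal output above.
# Assumption: your installed version may print a different list.
SUPPORTED = ['ar', 'ar-tn', 'ca', 'cn', 'cs', 'de', 'en-gb', 'en-in', 'en-us',
             'eo', 'es', 'fa', 'fr', 'gu', 'hi', 'it', 'ja', 'ko', 'kz', 'nl',
             'pl', 'pt', 'ru', 'sv', 'te', 'tg', 'tr', 'ua', 'uz', 'vn']


def normalize_language(code: str) -> str:
    """Return the code in lower case with '-' separators, or raise ValueError."""
    cleaned = code.strip().lower().replace("_", "-")
    if cleaned not in SUPPORTED:
        raise ValueError(
            f"unsupported language {code!r}; choose one of: {', '.join(SUPPORTED)}"
        )
    return cleaned


if __name__ == "__main__":
    print(normalize_language("EN_US"))
```

The returned value can then be passed as the `language` argument when creating the recognizer.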
Code
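from fusion_hat.stt import Vosk as STT

# Create the recognizer; the first run with a new language downloads its model.
stt = STT(language="en-us")

while True:
    print("Say something")
    # With stream=True, listen() yields results while you are still speaking.
    for result in stt.listen(stream=True):
        if result["done"]:
            # The sentence is complete: print the final text.
            print(f"final: {result['final']}")
        else:
            # Overwrite the current line with the latest partial text.
            print(f"partial: {result['partial']}", end="\r", flush=True)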
Code explanation:
`stt.listen(stream=True)`: Starts streaming speech recognition and yields intermediate results as you speak.
`result["partial"]`: The text recognized so far, updated continuously in real time.
`result["final"]`: The final recognized sentence, available when you stop speaking.
The loop runs continuously, allowing hands-free real-time transcription.
Tip: This streaming mode is perfect for voice assistants, command control, or live transcription.
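For command control, usually only the final result matters. The sketch below shows one way to skip the partial updates and map the finished sentence to an action. It assumes each streamed result is a dict with the `done`, `partial`, and `final` keys used in the code above. `fake_stream` is a hypothetical stand-in for `stt.listen(stream=True)` so the example runs without a microphone, and `COMMANDS` is an illustrative mapping.

```python
from typing import Iterable, Iterator, Optional

# Illustrative phrase-to-action table; replace with your own commands.
COMMANDS = {"turn on the light": "LIGHT_ON", "turn off the light": "LIGHT_OFF"}


def fake_stream() -> Iterator[dict]:
    """Hypothetical stand-in for stt.listen(stream=True)."""
    yield {"done": False, "partial": "turn on", "final": ""}
    yield {"done": False, "partial": "turn on the", "final": ""}
    yield {"done": True, "partial": "", "final": "turn on the light"}


def final_text(results: Iterable[dict]) -> Optional[str]:
    """Consume results until one is marked done and return its final text."""
    for result in results:
        if result["done"]:
            return result["final"].strip().lower()
    return None


def match_command(text: Optional[str]) -> Optional[str]:
    """Look up the recognized sentence in COMMANDS; None if there is no match."""
    if text is None:
        return None
    return COMMANDS.get(text)


if __name__ == "__main__":
    print(match_command(final_text(fake_stream())))
```

With the real recognizer, `final_text` would receive `stt.listen(stream=True)` instead of `fake_stream()`, provided the results have the shape assumed here.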
Troubleshooting
No such file or directory (when running `arecord`)
You may have used the wrong card/device number. Run `arecord -l` and replace `1,0` with the numbers shown for your USB microphone.
Vosk does not recognize speech
Make sure the language code matches your model and appears in the supported-languages list printed at startup (e.g. `en-us` for English, `cn` for Chinese).
Keep the microphone 15–30 cm away and avoid background noise.
Speak clearly and slowly.
High latency / slow recognition
The default auto-download is a small model (faster, but less accurate).
If it's still slow, close other programs to free CPU.
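When the card/device numbers are in doubt, the `arecord -l` listing can also be read programmatically. This is a minimal sketch that extracts `card,device` pairs from the listing text. `SAMPLE` is illustrative output, not taken from a Fusion HAT+ system, and `capture_devices` is a hypothetical helper name.

```python
import re

# Illustrative `arecord -l` output; names and numbers on your system will differ.
SAMPLE = """\
**** List of CAPTURE Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

# Matches lines such as "card 1: ... , device 0: ..." and captures both numbers.
CARD_LINE = re.compile(r"^card (\d+): .*?, device (\d+):", re.MULTILINE)


def capture_devices(listing: str) -> list:
    """Return 'card,device' strings for every capture device in the listing."""
    return [f"{card},{device}" for card, device in CARD_LINE.findall(listing)]


if __name__ == "__main__":
    print(capture_devices(SAMPLE))
```

On the Raspberry Pi itself, the listing text can be obtained with `subprocess.run(["arecord", "-l"], capture_output=True, text=True).stdout` and passed to `capture_devices`.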