3.4 Fusion HAT+ Microphone

Introduction

The Fusion HAT+ includes a built-in microphone, making it ideal for audio input applications such as voice recognition, sound detection, or recording logs in AI/IoT projects.

This guide will help you check if the microphone is recognized by the system and show you how to perform a basic recording test.

../_images/fusionhat_mic.png

What You'll Need

Below are the components required for this tutorial:

Component                             Purchase Link
Fusion HAT+                           -
Raspberry Pi (or compatible model)    -

Run the program

cd ~/ai-lab-kit/llm
sudo python3 stt_vosk_stream.py

The first time you run this code with a new language, Vosk will:

  • Automatically download the language model (by default, the small version).

  • Print out the list of supported languages.

  • Start listening for audio input through the microphone.

You'll see something like this in the terminal:

vosk-model-small-en-us-0.15.zip: 100%|███████████████████| 39.3M/39.3M [00:05<00:00, 7.85MB/s]
['ar', 'ar-tn', 'ca', 'cn', 'cs', 'de', 'en-gb', 'en-in', 'en-us', 'eo', 'es', 'fa', 'fr', 'gu', 'hi', 'it', 'ja', 'ko', 'kz', 'nl', 'pl', 'pt', 'ru', 'sv', 'te', 'tg', 'tr', 'ua', 'uz', 'vn']
Say something

This means:

  • The model file (vosk-model-small-en-us-0.15) has been downloaded.

  • The list of supported languages has been printed.

  • The system is now listening. Say something into the Fusion HAT+ microphone, and the recognized text will appear in the terminal.
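Because the script prints the supported language codes at startup, it can help to check a code before passing it to the recognizer. The following is a minimal sketch, not part of the Fusion HAT+ library: `check_language` is a hypothetical helper, and the list is copied from the sample output above, so it may differ in other versions.

```python
# Language codes copied from the sample terminal output above.
# Assumption: your installed version may print a different list.
SUPPORTED_LANGUAGES = [
    'ar', 'ar-tn', 'ca', 'cn', 'cs', 'de', 'en-gb', 'en-in', 'en-us', 'eo',
    'es', 'fa', 'fr', 'gu', 'hi', 'it', 'ja', 'ko', 'kz', 'nl', 'pl', 'pt',
    'ru', 'sv', 'te', 'tg', 'tr', 'ua', 'uz', 'vn',
]


def check_language(code, supported=SUPPORTED_LANGUAGES):
    """Return (ok, suggestions) for a requested language code.

    Hypothetical helper for illustration only.
    """
    # Accept variants such as "EN_US" by normalizing case and separator.
    normalized = code.strip().lower().replace("_", "-")
    if normalized in supported:
        return True, [normalized]
    # Otherwise suggest codes that share the same leading part,
    # e.g. "en" -> en-gb, en-in, en-us.
    prefix = normalized.split("-")[0]
    suggestions = [c for c in supported if c.split("-")[0] == prefix]
    return False, suggestions


ok, suggestions = check_language("en")
print(ok, suggestions)
```

If the check fails, pick one of the suggested codes (or any code from the printed list) and use it as the `language` argument.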

Tips:

  • Keep the microphone about 15–30 cm away for better accuracy.

  • Choose a model that matches your language and accent.

  • Use a quiet environment to improve recognition.

Code

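from fusion_hat.stt import Vosk as STT

# Load the Vosk model for US English (downloaded on first use)
stt = STT(language="en-us")

while True:
    print("Say something")
    # stream=True yields partial results while you speak, then a final one
    for result in stt.listen(stream=True):
        if result["done"]:
            # Final result: print the complete sentence on its own line
            print(f"final:   {result['final']}")
        else:
            # Partial result: overwrite the current terminal line in place
            print(f"partial: {result['partial']}", end="\r", flush=True)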

Code explanation:

  • stt.listen(stream=True): Starts streaming speech recognition and yields intermediate results as you speak.

  • result["partial"]: The text recognized so far, updated continuously while you speak.

  • result["final"]: The complete recognized sentence, available once result["done"] is true (when you stop speaking).

  • The loop runs continuously, allowing hands-free real-time transcription.

Tip: This streaming mode is perfect for voice assistants, command control, or live transcription.
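As a rough illustration of command control, the sketch below acts only on final results and ignores partial ones. It assumes result dictionaries with the same "done", "partial", and "final" keys used in the code above. `handle_stream`, the command phrases, and the fake result list are hypothetical stand-ins so the example runs without a microphone.

```python
def handle_stream(results, commands):
    """Run the first matching command for each final sentence.

    results:  any iterable of dicts shaped like the streaming results above
    commands: dict mapping a spoken phrase to a zero-argument callable
    Returns the list of phrases that were triggered, in order.
    """
    triggered = []
    for result in results:
        if not result["done"]:
            continue  # partial text is still changing; wait for the final sentence
        text = result["final"].strip().lower()
        for phrase, action in commands.items():
            if phrase in text:
                action()
                triggered.append(phrase)
                break  # at most one command per sentence
    return triggered


# Stand-in for stt.listen(stream=True), so no hardware is needed here.
fake_results = [
    {"done": False, "partial": "turn on"},
    {"done": False, "partial": "turn on the"},
    {"done": True, "final": "turn on the light"},
    {"done": True, "final": "hello there"},
    {"done": True, "final": "Turn off the light"},
]

state = {"light": False}
commands = {
    "turn on the light": lambda: state.update(light=True),
    "turn off the light": lambda: state.update(light=False),
}

triggered = handle_stream(fake_results, commands)
print(triggered, state)
```

On real hardware, the same function could be given `stt.listen(stream=True)` in place of `fake_results`. Matching on substrings is a deliberately simple choice; short phrases may trigger by accident in longer sentences.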

Troubleshooting

  • No such file or directory (when running `arecord`)

    You may have used the wrong card or device number. Run:

    arecord -l

    and replace 1,0 in your command with the card and device numbers listed for your microphone.

  • Vosk does not recognize speech

    • Make sure the language code matches your model and appears in the supported list printed at startup (e.g. en-us for US English, cn for Chinese).

    • Keep the microphone 15–30 cm away and avoid background noise.

    • Speak clearly and slowly.

  • High latency / slow recognition

    • The default auto-download is a small model (faster, but less accurate).

    • If it's still slow, close other programs to free CPU.
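For the card/device check in the first troubleshooting item, the sketch below parses the text printed by `arecord -l` and extracts the card and device numbers. The helper names are hypothetical, and the sample text is an illustrative example of the usual English-locale output format, not output captured from a Fusion HAT+.

```python
import re
import subprocess

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
# Assumption: English-locale output from arecord.
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+): ")


def parse_arecord_list(text):
    """Extract capture devices from the text printed by `arecord -l`."""
    devices = []
    for line in text.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, _short_name, long_name, device = match.groups()
            devices.append({
                "card": int(card),
                "device": int(device),
                "name": long_name,
                "hw": f"{card},{device}",  # the "card,device" pair, e.g. "1,0"
            })
    return devices


def list_capture_devices():
    """Run `arecord -l` and parse its output (requires ALSA utilities)."""
    completed = subprocess.run(
        ["arecord", "-l"], capture_output=True, text=True, check=True
    )
    return parse_arecord_list(completed.stdout)


# Illustrative sample text, used so the parser can be tried without hardware.
sample = """**** List of CAPTURE Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

devices = parse_arecord_list(sample)
print(devices)
```

On the Raspberry Pi itself, calling `list_capture_devices()` returns the same structure for the real hardware; the "hw" value is the pair to use in place of 1,0.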