fusion_hat.stt module
Speech to Text module
Convert speech to text.
Example
Import STT and instantiate it
>>> from fusion_hat.stt import STT
>>> stt = STT(language="en-us")
Listen for speech input
>>> result = stt.listen(stream=False)
>>> print(result)
Hello
Use stream mode to get partial results
>>> for result in stt.listen(stream=True):
...     if result["done"]:
...         print(f"\r\x1b[Kfinal: {result['final']}")
...     else:
...         print(f"\r\x1b[Kpartial: {result['partial']}", end="", flush=True)
Wait for wake words
>>> WAKE_WORDS = ["hey robot", "hello robot"]
>>> stt = STT(language="en-us")
>>> stt.set_wake_words(WAKE_WORDS)
>>> print(f'Wake me with: {WAKE_WORDS}')
Wake me with: ['hey robot', 'hello robot']
>>> result = stt.wait_until_heard()
>>> print("Wake word detected")
Wake word detected
Detect wake words in a background thread
>>> import time
>>> while True:
...     stt.start_listening_wake_words()
...     while not stt.is_waked():
...         print("Waiting for wake word...")
...         time.sleep(3)
...     print("Wake word detected")
- class fusion_hat.stt.Vosk(language=None, samplerate=None, device=None, log=None)[source]
Bases: object
Vosk STT class
- DEFAULT_LANGUAGE = 'en-us'
- is_ready()[source]
Check if Vosk STT is ready
- Returns:
True if ready, False otherwise
- Return type:
bool
- _load_model_list()[source]
Load model list from local cache or built-in defaults (offline, no network).
- update_model_list()[source]
Fetch latest model list from network and save to cache.
Call this manually when you want to check for new models online. Falls back to local cache if network is unavailable.
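The fallback order described above (network, then local cache, then built-in defaults) can be sketched generically. This is not the library's implementation: `refresh_model_list`, `DEFAULT_MODELS`, the JSON cache format, and the entry fields are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Placeholder defaults; the real built-in list and its fields are not shown in this reference.
DEFAULT_MODELS = [{"lang": "en-us", "name": "example-model-en-us"}]


def refresh_model_list(fetch, cache_path):
    """Try the network first; on failure fall back to the cache, then to defaults.

    `fetch` is any zero-argument callable that returns the model list and
    raises OSError when the network is unavailable (hypothetical interface).
    """
    cache_path = Path(cache_path)
    try:
        models = fetch()
        cache_path.write_text(json.dumps(models))  # keep a copy for offline use
        return models
    except OSError:
        if cache_path.exists():
            return json.loads(cache_path.read_text())
        return DEFAULT_MODELS
```

Keeping the network call in an explicit refresh step, as update_model_list() does, means normal start-up can stay fully offline.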
- wait_until_heard(wake_words=None, print_callback=<function Vosk.<lambda>>)[source]
Wait until a wake word is heard
- Parameters:
wake_words (list, optional) – Wake words, default is None
print_callback (function, optional) – Print callback, defaults to a built-in lambda (see signature)
- Returns:
Heard wake word
- Return type:
str
- heard_wake_word(print_callback=<function Vosk.<lambda>>)[source]
Check whether a wake word has been heard
- Parameters:
print_callback (function, optional) – Print callback, defaults to a built-in lambda (see signature)
- Returns:
True if heard a wake word, False otherwise
- Return type:
bool
- is_waked()[source]
Check if the wake word thread is running
- Returns:
True if running, False otherwise
- Return type:
bool
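The threaded example at the top of this page polls is_waked() in an unbounded loop. A minimal sketch of the same polling pattern with a timeout follows. `wait_for_wake` and `FakeSTT` are hypothetical helpers, and the sketch assumes, as that example does, that is_waked() becomes true once a wake word has been heard.

```python
import time


def wait_for_wake(stt, timeout=30.0, poll_interval=0.1):
    """Start background wake-word listening and poll until woken or timed out.

    Returns True if stt.is_waked() became true within `timeout` seconds.
    """
    stt.start_listening_wake_words()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if stt.is_waked():
            return True
        time.sleep(poll_interval)
    return False


class FakeSTT:
    """Hypothetical stand-in that reports "woken" after a fixed number of polls."""

    def __init__(self, polls_until_wake):
        self.polls_left = polls_until_wake

    def start_listening_wake_words(self):
        pass  # a real engine would start its listener thread here

    def is_waked(self):
        self.polls_left -= 1
        return self.polls_left <= 0
```

With a real engine, `wait_for_wake(stt, timeout=10)` would be the equivalent call under the same assumption.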
- stt(filename, stream=False)[source]
Perform STT on audio file
- Parameters:
filename (str) – Audio file path
stream (bool, optional) – Stream mode, default is False
- Returns:
STT result
- Return type:
str
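Because get_stream_result() takes a wave.Wave_read object, stt() appears to expect a WAV file. Vosk's own examples ask for mono 16-bit PCM input. Whether this wrapper converts other formats is not stated here, so a pre-check such as the following sketch may help. `describe_wav` and `is_mono_pcm16` are hypothetical helpers built only on the standard `wave` module.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate) for a WAV file."""
    with wave.open(path, "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth(), wf.getframerate()


def is_mono_pcm16(path):
    """Return True if the file is single-channel with 16-bit (2-byte) samples."""
    channels, width, _rate = describe_wav(path)
    return channels == 1 and width == 2
```

A caller could run `is_mono_pcm16(filename)` before passing the file to stt() and convert or reject the file otherwise.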
- get_stream_result(wf, recognizer)[source]
Get streaming results from recognizer
- Parameters:
wf (wave.Wave_read) – Wave file object
recognizer (KaldiRecognizer) – Vosk recognizer
- Yields:
str – STT result
- listen(stream=False, device=None, samplerate=None)[source]
Listen from microphone and return results
- Parameters:
stream (bool, optional) – Stream mode, default is False
device (int, optional) – Device index, default is None
samplerate (int, optional) – Sampling rate, default is None
- Returns:
STT result. In stream mode, results are produced incrementally, as in the streaming example above.
- Return type:
str
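The streaming example at the top of this page reads `done`, `partial`, and `final` keys from each result. A minimal sketch of reducing such a stream to one final string follows. `collect_final` and `fake_stream` are hypothetical helpers, and the dict layout is assumed from that example rather than confirmed by this reference.

```python
from typing import Dict, Iterable, Iterator


def collect_final(results: Iterable[Dict]) -> str:
    """Consume streaming results and return the first final transcript.

    Assumes each item is a dict with a boolean "done" key, plus "partial"
    while speech is in progress and "final" once it is done.
    """
    for result in results:
        if result.get("done"):
            return result.get("final", "")
    return ""  # the stream ended without a final result


def fake_stream() -> Iterator[Dict]:
    """Hypothetical stand-in for stt.listen(stream=True), usable without a microphone."""
    yield {"done": False, "partial": "hel"}
    yield {"done": False, "partial": "hello"}
    yield {"done": True, "final": "hello"}


print(collect_final(fake_stream()))
```

With real hardware, `collect_final(stt.listen(stream=True))` would be the equivalent call under the same assumption.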
- _listen_streaming(q, device=None, samplerate=None, callback=None)[source]
Listen from microphone and return streaming results
- Parameters:
q (queue.Queue) – Queue to store audio data
device (int, optional) – Device index, default is None
samplerate (int, optional) – Sampling rate, default is None
callback (function, optional) – Callback function, default is None
- Yields:
dict – STT result
- _listen_non_streaming(q, device=None, samplerate=None, callback=None)[source]
Listen from microphone and return final result
- Parameters:
q (queue.Queue) – Queue to store audio data
device (int, optional) – Device index, default is None
samplerate (int, optional) – Sampling rate, default is None
callback (function, optional) – Callback function, default is None
- Returns:
STT result
- Return type:
str
- set_wake_words(wake_words: list)[source]
Set wake words
- Parameters:
wake_words (list) – List of wake words
- set_language(language: str, init=True)[source]
Set language
- Parameters:
language (str) – Language to set
init (bool, optional) – Initialize recognizer, default is True
- get_model_name(lang: str) str[source]
Get model name for language
- Parameters:
lang (str) – Language
- Returns:
Model name
- Return type:
str
- get_model_path(lang: str) Path[source]
Get model path for language
- Parameters:
lang (str) – Language
- Returns:
Model path
- Return type:
Path
- is_model_downloaded(lang: str) bool[source]
Check if model is downloaded
- Parameters:
lang (str) – Language
- Returns:
True if model is downloaded, False otherwise
- Return type:
bool
- download_model(lang: str, progress_callback=None, max_retries: int = 5)[source]
Download model for language
- Parameters:
lang (str) – Language
progress_callback (function, optional) – Progress callback function, default is None
max_retries (int, optional) – Maximum retries, default is 5
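The model methods above can be combined so that a download happens only when the model is missing. The sketch below uses only the documented is_model_downloaded(), download_model(), and get_model_path() signatures. `ensure_model` and `FakeEngine` are hypothetical, and progress_callback is left out because its call signature is not documented here.

```python
from pathlib import Path


def ensure_model(engine, lang, max_retries=5):
    """Download the model for `lang` only if it is missing, then return its path."""
    if not engine.is_model_downloaded(lang):
        engine.download_model(lang, max_retries=max_retries)
    return engine.get_model_path(lang)


class FakeEngine:
    """Hypothetical stand-in that mimics the three documented model methods."""

    def __init__(self):
        self.downloaded = set()
        self.download_calls = 0

    def is_model_downloaded(self, lang):
        return lang in self.downloaded

    def download_model(self, lang, progress_callback=None, max_retries=5):
        self.download_calls += 1
        self.downloaded.add(lang)

    def get_model_path(self, lang):
        return Path("models") / lang  # placeholder location, not the real cache path
```

Calling `ensure_model` twice for the same language triggers one download at most, which keeps repeated start-ups offline once the model is cached.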