fusion_hat.stt module

Speech to Text module

Convert speech to text.

Example

Import STT and instantiate it

>>> from fusion_hat.stt import STT
>>> stt = STT(language="en-us")

Listen for speech input

>>> result = stt.listen(stream=False)
>>> print(result)
Hello

Use stream mode to get partial results

>>> for result in stt.listen(stream=True):
...     if result["done"]:
...         print(f"\r\x1b[Kfinal: {result['final']}")
...     else:
...         print(f"\r\x1b[Kpartial: {result['partial']}", end="", flush=True)
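When only the final transcript matters but partial results should still be surfaced, the streaming loop above can be wrapped in a small helper. This is a sketch, not part of fusion_hat: `collect_final` is a hypothetical name, and the only assumption is that each item is a dict with the "done", "partial", and "final" keys used in the example above. A plain list stands in for `stt.listen(stream=True)` so the snippet runs without a microphone.

```python
def collect_final(results, on_partial=None):
    """Return the first final transcript from an iterable of result dicts."""
    for result in results:
        if result["done"]:
            return result["final"]
        if on_partial is not None:
            on_partial(result["partial"])
    return None  # the stream ended without a final result


# Simulated stream; with hardware this would be stt.listen(stream=True).
fake_stream = [
    {"done": False, "partial": "hel"},
    {"done": False, "partial": "hello"},
    {"done": True, "final": "hello"},
]
seen = []
text = collect_final(fake_stream, on_partial=seen.append)
print(text)
```

Passing `on_partial=None` skips partial results entirely, which is roughly what `stream=False` offers in a single call.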

Wait for wake words

>>> WAKE_WORDS = ["hey robot", "hello robot"]
>>> stt = STT(language="en-us")
>>> stt.set_wake_words(WAKE_WORDS)
>>> print(f'Wake me with: {WAKE_WORDS}')
Wake me with: ['hey robot', 'hello robot']
>>> result = stt.wait_until_heard()
>>> print("Wake word detected")
Wake word detected

Listen for wake words in a background thread

>>> import time
>>> while True:
...     stt.start_listening_wake_words()
...     while not stt.is_waked():
...         print("Waiting for wake word...")
...         time.sleep(3)
...     print("Wake word detected")
fusion_hat.stt.STT

alias of Vosk

class fusion_hat.stt.Vosk(language=None, samplerate=None, device=None, log=None)[source]

Bases: object

Vosk STT class

DEFAULT_LANGUAGE = 'en-us'

is_ready()[source]

Check if Vosk STT is ready

Returns:

True if ready, False otherwise

Return type:

bool

init()[source]

Initialize Vosk STT

_load_model_list()[source]

Load model list from local cache or built-in defaults (offline, no network).

update_model_list()[source]

Fetch latest model list from network and save to cache.

Call this manually when you want to check for new models online. Falls back to local cache if network is unavailable.
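The network-first, cache-second behaviour described above can be sketched in a few lines. This is not the library's implementation: `load_model_list`, the JSON cache format, and the file name are all assumptions made for illustration, and a function that raises `OSError` stands in for a failed network request.

```python
import json
import tempfile
from pathlib import Path


def load_model_list(fetch, cache_path, defaults):
    """Try fetch(); on failure fall back to the cache file, then to defaults."""
    try:
        models = fetch()
        cache_path.write_text(json.dumps(models))  # refresh the cache
        return models
    except OSError:
        # Network errors raised by urllib and socket are OSError subclasses.
        if cache_path.exists():
            return json.loads(cache_path.read_text())
        return defaults


def offline_fetch():
    raise OSError("network unreachable")  # simulate no connectivity


cache = Path(tempfile.mkdtemp()) / "models.json"
first = load_model_list(lambda: [{"lang": "en-us"}], cache, defaults=[])
second = load_model_list(offline_fetch, cache, defaults=[])
print(second)
```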

wait_until_heard(wake_words=None, print_callback=<function Vosk.<lambda>>)[source]

Block until a wake word is heard

Parameters:
  • wake_words (list, optional) – Wake words, default is None

  • print_callback (function, optional) – Print callback, default is the lambda shown in the signature

Returns:

Heard wake word

Return type:

str
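How recognized text is compared against the wake words is not documented here. As one plausible approach, the sketch below uses a case-insensitive substring match after collapsing whitespace; `find_wake_word` is a hypothetical helper, not a fusion_hat function, and the library's actual rule may differ.

```python
def find_wake_word(text, wake_words):
    """Return the first wake word contained in text, or None."""
    # Lowercase and collapse runs of whitespace before comparing.
    normalized = " ".join(text.lower().split())
    for word in wake_words:
        if word.lower() in normalized:
            return word
    return None


WAKE_WORDS = ["hey robot", "hello robot"]
heard = find_wake_word("Hey   Robot, turn left", WAKE_WORDS)
print(heard)
```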

heard_wake_word(print_callback=<function Vosk.<lambda>>)[source]

Check whether a wake word has been heard

Parameters:

print_callback (function, optional) – Print callback, default is the lambda shown in the signature

Returns:

True if heard a wake word, False otherwise

Return type:

bool

wait_for_wake_word()[source]

Wait for wake word

start_listening_wake_words()[source]

Start listening for wake words

is_waked()[source]

Check whether the wake word thread has detected a wake word

Returns:

True if a wake word has been detected, False otherwise

Return type:

bool
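The threaded example at the top of this page polls is_waked() indefinitely. A bounded wait is often more convenient. The sketch below assumes, as that example suggests, that is_waked() becomes True once a wake word has been heard; `wait_for` is a hypothetical helper, and a counter-based function stands in for `stt.is_waked` so the snippet runs without hardware.

```python
import time


def wait_for(predicate, timeout=10.0, interval=0.1):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False


# Stand-in for stt.is_waked: reports True on the third poll.
calls = {"n": 0}


def fake_is_waked():
    calls["n"] += 1
    return calls["n"] >= 3


woke = wait_for(fake_is_waked, timeout=2.0, interval=0.01)
print(woke)
```

With real hardware the call would look like `wait_for(stt.is_waked, timeout=30)` after `stt.start_listening_wake_words()`.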

stt(filename, stream=False)[source]

Perform STT on audio file

Parameters:
  • filename (str) – Audio file path

  • stream (bool, optional) – Stream mode, default is False

Returns:

STT result

Return type:

str
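Vosk's own examples work with mono 16-bit PCM WAV files. Whether this wrapper enforces or converts the format is not stated here, so checking a file before passing it to stt() can save a confusing failure. The sketch uses only the standard `wave` module; `describe_wav` and `looks_like_vosk_input` are hypothetical helper names.

```python
import tempfile
import wave
from pathlib import Path


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate) for a WAV file."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnchannels(), wf.getsampwidth(), wf.getframerate()


def looks_like_vosk_input(path):
    """True for mono 16-bit PCM, the format Vosk's own examples use."""
    channels, width, _rate = describe_wav(path)
    return channels == 1 and width == 2


# Write one second of silence so the check can run without a recording.
sample = Path(tempfile.mkdtemp()) / "silence.wav"
with wave.open(str(sample), "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)

print(describe_wav(sample), looks_like_vosk_input(sample))
```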

get_stream_result(wf, recognizer)[source]

Get streaming results from recognizer

Parameters:
  • wf (wave.Wave_read) – Wave file object

  • recognizer (KaldiRecognizer) – Vosk recognizer

Yields:

str – STT result
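For orientation, the loop below shows the chunked read-and-feed pattern used in Vosk's public examples, built on the recognizer methods `AcceptWaveform`, `Result`, `PartialResult`, and `FinalResult`. It is a sketch, not the source of get_stream_result: the `(kind, text)` tuples, the `stream_results` name, and the chunk size are assumptions, and `FakeRecognizer` replaces a real `KaldiRecognizer` so the snippet runs without a model.

```python
import io
import json
import wave


def stream_results(wf, recognizer, chunk_frames=4000):
    """Feed wf to recognizer in chunks, yielding partial and final text."""
    while True:
        data = wf.readframes(chunk_frames)
        if not data:
            break
        if recognizer.AcceptWaveform(data):
            yield "final", json.loads(recognizer.Result())["text"]
        else:
            yield "partial", json.loads(recognizer.PartialResult())["partial"]
    yield "final", json.loads(recognizer.FinalResult())["text"]


class FakeRecognizer:
    """Stand-in exposing the four KaldiRecognizer methods used above."""

    def __init__(self):
        self.chunks = 0

    def AcceptWaveform(self, data):
        self.chunks += 1
        return False  # never reports an utterance boundary

    def PartialResult(self):
        return json.dumps({"partial": f"chunk {self.chunks}"})

    def Result(self):
        return json.dumps({"text": ""})

    def FinalResult(self):
        return json.dumps({"text": "hello"})


# Build a short in-memory WAV: 8000 frames of mono 16-bit silence.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 8000)
buf.seek(0)

with wave.open(buf, "rb") as wf:
    events = list(stream_results(wf, FakeRecognizer()))
print(events)
```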

listen(stream=False, device=None, samplerate=None)[source]

Listen from microphone and return results

Parameters:
  • stream (bool, optional) – Stream mode, default is False

  • device (int, optional) – Device index, default is None

  • samplerate (int, optional) – Sampling rate, default is None

Returns:

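STT result. With stream=True, results arrive incrementally as dicts carrying the "done", "partial", and "final" keys shown in the streaming example.

Return type:

str when stream is False; an iterable of dict when stream is True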

_listen_streaming(q, device=None, samplerate=None, callback=None)[source]

Listen from microphone and return streaming results

Parameters:
  • q (queue.Queue) – Queue to store audio data

  • device (int, optional) – Device index, default is None

  • samplerate (int, optional) – Sampling rate, default is None

  • callback (function, optional) – Callback function, default is None

Yields:

dict – STT result

_listen_non_streaming(q, device=None, samplerate=None, callback=None)[source]

Listen from microphone and return final result

Parameters:
  • q (queue.Queue) – Queue to store audio data

  • device (int, optional) – Device index, default is None

  • samplerate (int, optional) – Sampling rate, default is None

  • callback (function, optional) – Callback function, default is None

Returns:

STT result

Return type:

str
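Both private listeners take a `queue.Queue` of audio data. A common arrangement, seen in Vosk's microphone example, is an audio callback that pushes raw blocks into the queue while the recognizer loop pulls them out. The sketch below illustrates only that hand-off. It assumes a sounddevice-style callback signature, which this page does not confirm, and `make_audio_callback` and `drain` are hypothetical names.

```python
import queue


def make_audio_callback(q):
    """Build a callback that copies each captured block into q."""
    def callback(indata, frames, time_info, status):
        # Signature follows sounddevice's raw input stream callbacks.
        if status:
            print(status)
        q.put(bytes(indata))
    return callback


def drain(q, max_blocks):
    """Collect up to max_blocks audio blocks that are already queued."""
    blocks = []
    while len(blocks) < max_blocks:
        try:
            blocks.append(q.get_nowait())
        except queue.Empty:
            break
    return blocks


audio_q = queue.Queue()
cb = make_audio_callback(audio_q)
# Simulate the audio backend delivering two blocks.
cb(bytearray(b"\x01\x02"), 1, None, None)
cb(bytearray(b"\x03\x04"), 1, None, None)
blocks = drain(audio_q, max_blocks=10)
print(blocks)
```

Copying with `bytes(indata)` matters because audio backends commonly reuse the buffer they hand to the callback.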

set_wake_words(wake_words: list)[source]

Set wake words

Parameters:

wake_words (list) – List of wake words

language() → str[source]

Get current language

Returns:

Current language

Return type:

str

set_language(language: str, init=True)[source]

Set language

Parameters:
  • language (str) – Language to set

  • init (bool, optional) – Initialize recognizer, default is True

get_model_name(lang: str) → str[source]

Get model name for language

Parameters:

lang (str) – Language

Returns:

Model name

Return type:

str

get_model_path(lang: str) → Path[source]

Get model path for language

Parameters:

lang (str) – Language

Returns:

Model path

Return type:

Path

is_model_downloaded(lang: str) → bool[source]

Check if model is downloaded

Parameters:

lang (str) – Language

Returns:

True if model is downloaded, False otherwise

Return type:

bool

cancel_download()[source]

Cancel an ongoing model download

download_model(lang: str, progress_callback=None, max_retries: int = 5)[source]

Download model for language

Parameters:
  • lang (str) – Language

  • progress_callback (function, optional) – Progress callback function, default is None

  • max_retries (int, optional) – Maximum retries, default is 5
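The `max_retries` parameter implies a retry loop around the download. A minimal sketch of such a loop follows. It is not the library's code: `download_with_retries`, the linear back-off, and the choice to retry only on `OSError` are assumptions, and a function that fails twice stands in for the real transfer.

```python
import time


def download_with_retries(download, max_retries=5, base_delay=0.0):
    """Call download() until it succeeds or max_retries attempts fail."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return download()
        except OSError as exc:  # covers urllib and socket failures
            last_error = exc
            time.sleep(base_delay * attempt)  # linear back-off
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error


attempts = []


def flaky_download():
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError("connection reset")
    return "model.zip"


result = download_with_retries(flaky_download, max_retries=5)
print(result, len(attempts))
```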

download_progress_hook(tqdm_bar=None, progress_callback=None)[source]

Download progress hook function

Parameters:
  • tqdm_bar (tqdm, optional) – tqdm progress bar, default is None

  • progress_callback (function, optional) – Progress callback function, default is None
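The name and parameters suggest a hook in the style of the `reporthook` argument to `urllib.request.urlretrieve`, which is called with the block number, block size, and total size. That link is an assumption, as is the `(done, total)` callback signature used below; `make_progress_hook` is a hypothetical helper that shows how such a hook can translate block counts into byte progress.

```python
def make_progress_hook(progress_callback):
    """Build a reporthook that reports (bytes_done, total_bytes)."""
    def hook(block_num, block_size, total_size):
        done = block_num * block_size
        if total_size > 0:
            done = min(done, total_size)  # the last block may overshoot
        progress_callback(done, total_size)
    return hook


updates = []
hook = make_progress_hook(lambda done, total: updates.append((done, total)))
# Simulate urlretrieve reporting a 2500-byte download in 1024-byte blocks.
for block in range(4):
    hook(block, 1024, 2500)
print(updates)
```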

stop_listening()[source]

Stop listening for wake words

close()[source]

Close STT