Wake word detection entity
A wake word detection entity allows other integrations or applications to detect wake words (also called hotwords) in an audio stream.
A wake word detection entity is derived from the homeassistant.components.wake_word.WakeWordDetectionEntity class.
Properties
Properties should always only return information from memory and not do I/O (like network requests).
Name | Type | Default | Description |
---|---|---|---|
supported_wake_words | list[WakeWord] | Required | The wake words supported by the service. |
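As a rough sketch of the "memory only" rule, the entity can cache its wake words during setup and have the property return the cached list. The WakeWord class below is a stand-in so the example runs without Home Assistant installed; its field names (ww_id, name) and the MyWakeWordEntity class are assumptions for illustration, not the library's definitions.

```python
from dataclasses import dataclass


@dataclass
class WakeWord:
    """Stand-in for the real WakeWord type; field names are assumptions."""

    ww_id: str
    name: str


class MyWakeWordEntity:
    """Hypothetical entity; a real one would derive from WakeWordDetectionEntity."""

    def __init__(self, wake_words: list[WakeWord]) -> None:
        # Fetched once during setup (for example from the detection service)
        # and cached so that the property never has to do I/O.
        self._wake_words = list(wake_words)

    @property
    def supported_wake_words(self) -> list[WakeWord]:
        """Return the cached wake words without touching the network."""
        return self._wake_words


entity = MyWakeWordEntity([WakeWord(ww_id="okay_home", name="Okay Home")])
```

Reading entity.supported_wake_words is then a plain attribute lookup, which keeps it safe to call from the event loop.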
Methods
Process audio stream
The process audio stream method is used to detect wake words. It must return a DetectionResult, or None if the audio stream ends without a detection.
from collections.abc import AsyncIterable

from homeassistant.components.wake_word import DetectionResult, WakeWordDetectionEntity


class MyWakeWordDetectionEntity(WakeWordDetectionEntity):
    """Represent a Wake Word Detection entity."""

    async def async_process_audio_stream(
        self, stream: AsyncIterable[tuple[bytes, int]]
    ) -> DetectionResult | None:
        """Try to detect wake word(s) in an audio stream with timestamps.

        Audio must be 16 kHz sample rate with 16-bit mono PCM samples.
        """
The audio stream is made of tuples with the form (audio_chunk, timestamp), matching the tuple[bytes, int] annotation in the method signature, where:
audio_chunk is a chunk of 16-bit signed mono PCM samples at 16 kHz
timestamp is the number of milliseconds since the start of the audio stream
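To make the stream format concrete, here is a minimal sketch that produces and consumes such a stream. Following the tuple[bytes, int] annotation in the method signature, each item is unpacked as audio bytes first and timestamp second. The helpers silent_stream and total_duration_ms are hypothetical, and the example assumes the timestamp marks the start of its chunk.

```python
import asyncio
from collections.abc import AsyncIterable, AsyncIterator

SAMPLE_RATE = 16000  # samples per second (16 kHz)
SAMPLE_WIDTH = 2  # bytes per sample (16-bit)
BYTES_PER_MS = SAMPLE_RATE * SAMPLE_WIDTH // 1000  # 32 bytes of mono audio per ms


async def silent_stream(
    num_chunks: int, chunk_ms: int = 10
) -> AsyncIterator[tuple[bytes, int]]:
    """Yield chunks of silence, each tagged with its start time in milliseconds."""
    for index in range(num_chunks):
        yield bytes(chunk_ms * BYTES_PER_MS), index * chunk_ms


async def total_duration_ms(stream: AsyncIterable[tuple[bytes, int]]) -> int:
    """Consume the stream and return the end time of the last chunk."""
    end_ms = 0
    async for audio_chunk, timestamp in stream:
        # Assumption: the timestamp is the start of the chunk, so the chunk
        # ends after its own duration has elapsed.
        end_ms = timestamp + len(audio_chunk) // BYTES_PER_MS
    return end_ms


duration = asyncio.run(total_duration_ms(silent_stream(num_chunks=5)))
```

At 16 kHz with 2 bytes per sample, one millisecond of mono audio is 32 bytes, which is how the chunk length is converted back to a duration.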
If a wake word is detected, a DetectionResult is returned with:
ww_id - the unique identifier of the detected wake word
timestamp - the timestamp of the audio chunk when detection occurred
queued_audio - optional audio chunks that will be forwarded to speech-to-text (see below)
In an Assist pipeline, the audio stream is shared between wake word detection and speech-to-text. This means that any audio chunk removed during wake word detection cannot be processed by speech-to-text unless it is passed back in the queued_audio of a DetectionResult.