# Conversation API

Intents can be recognized from text and fired using the conversation integration.

An API endpoint is available that receives an input sentence and produces a conversation response. A "conversation" is tracked across multiple inputs and responses by passing a conversation id generated by Home Assistant.

The API is available via both the REST API and the WebSocket API.
A sentence may be POSTed to `/api/conversation/process` like:

```json
{
  "text": "turn on the lights in the living room",
  "language": "en"
}
```
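As a sketch, the REST call can be made with Python's standard library alone. The URL path and JSON fields come from this page; the base URL and the long-lived access token are placeholders you must supply for your own Home Assistant instance (the REST API authenticates with a bearer token).

```python
import json
import urllib.request

def build_payload(text, language=None, conversation_id=None):
    """Assemble the request body; optional fields are omitted when unset."""
    payload = {"text": text}
    if language is not None:
        payload["language"] = language
    if conversation_id is not None:
        payload["conversation_id"] = conversation_id
    return payload

def process_sentence(base_url, token, text, language=None, conversation_id=None):
    """POST to /api/conversation/process and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{base_url}/api/conversation/process",
        data=json.dumps(build_payload(text, language, conversation_id)).encode(),
        headers={
            "Authorization": f"Bearer {token}",  # long-lived access token
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```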
Or sent via the WebSocket API like:

```json
{
  "type": "conversation/process",
  "text": "turn on the lights in the living room",
  "language": "en"
}
```
The following input fields are available:

| Name | Type | Description |
|---|---|---|
| `text` | string | Input sentence. |
| `language` | string | Optional. Language of the input sentence (defaults to configured language). |
| `conversation_id` | string | Optional. Unique id to track the conversation, generated by Home Assistant. |
## Conversation response

The JSON response from `/api/conversation/process` contains information about the effect of the fired intent, for example:
```json
{
  "response": {
    "response_type": "action_done",
    "language": "en",
    "data": {
      "targets": [
        {
          "type": "area",
          "name": "Living Room",
          "id": "living_room"
        },
        {
          "type": "domain",
          "name": "light",
          "id": "light"
        }
      ],
      "success": [
        {
          "type": "entity",
          "name": "My Light",
          "id": "light.my_light"
        }
      ],
      "failed": []
    },
    "speech": {
      "plain": {
        "speech": "Turned Living Room lights on"
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
The following properties are available in the `response` object:

| Name | Type | Description |
|---|---|---|
| `response_type` | string | One of `action_done`, `query_answer`, or `error` (see response types). |
| `data` | dictionary | Relevant data for each response type. |
| `language` | string | The language of the intent and response. |
| `speech` | dictionary | Optional. Response text to speak to the user (see speech). |
The conversation id is returned alongside the conversation response.
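As a sketch, these fields can be pulled out of a decoded response like so. The dict below mirrors the documented example (with a placeholder id); in practice it would come from decoding the HTTP response body.

```python
# A decoded /api/conversation/process response, as documented above.
result = {
    "response": {
        "response_type": "action_done",
        "language": "en",
        "data": {"targets": [], "success": [], "failed": []},
        "speech": {"plain": {"speech": "Turned Living Room lights on"}},
    },
    "conversation_id": "abc123",
}

response = result["response"]
response_type = response["response_type"]       # "action_done", "query_answer", or "error"
spoken = response["speech"]["plain"]["speech"]  # text to speak to the user
conversation_id = result["conversation_id"]     # pass back on the next turn
```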
## Response types

### Action done

The intent produced an action in Home Assistant, such as turning on a light. The `data` property of the response contains a `targets` list, where each target looks like:
| Name | Type | Description |
|---|---|---|
| `type` | string | Target type. One of `area`, `domain`, `device_class`, `device`, `entity`, or `custom`. |
| `name` | string | Name of the affected target. |
| `id` | string | Optional. Id of the target. |
Two additional target lists are included, containing the devices or entities that succeeded (`success`) or failed (`failed`):
```json
{
  "response": {
    "response_type": "action_done",
    "data": {
      "targets": [
        (area or domain)
      ],
      "success": [
        (entities/devices that succeeded)
      ],
      "failed": [
        (entities/devices that failed)
      ]
    }
  }
}
```
An intent can have multiple targets which are applied on top of each other. The targets must be ordered from general to specific:

1. `area`
2. `domain` - Home Assistant integration domain, such as "light"
3. `device_class` - Device class for a domain, such as "garage_door" for the "cover" domain
4. `device`
5. `entity`
6. `custom` - A custom target
Most intents end up with 0, 1, or 2 targets. Three targets currently only occur when device classes are involved. Examples of target combinations:

- "Turn off all lights" - 1 target: `domain:light`
- "Turn on the kitchen lights" - 2 targets: `area:kitchen`, `domain:light`
- "Open the kitchen blinds" - 3 targets: `area:kitchen`, `domain:cover`, `device_class:blind`
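The examples above can be written out as target lists, together with a small helper that checks the general-to-specific ordering. The helper is illustrative only, not part of Home Assistant.

```python
# Target types in the required order, general to specific.
TYPE_ORDER = ["area", "domain", "device_class", "device", "entity", "custom"]

# The documented example commands as target lists.
examples = {
    "Turn off all lights": [{"type": "domain", "id": "light"}],
    "Turn on the kitchen lights": [
        {"type": "area", "id": "kitchen"},
        {"type": "domain", "id": "light"},
    ],
    "Open the kitchen blinds": [
        {"type": "area", "id": "kitchen"},
        {"type": "domain", "id": "cover"},
        {"type": "device_class", "id": "blind"},
    ],
}

def is_general_to_specific(targets):
    """Check that target types appear in general-to-specific order."""
    ranks = [TYPE_ORDER.index(t["type"]) for t in targets]
    return ranks == sorted(ranks)
```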
### Query answer
The response is an answer to a question, such as "what is the temperature?". See the speech property for the answer text.
```json
{
  "response": {
    "response_type": "query_answer",
    "language": "en",
    "speech": {
      "plain": {
        "speech": "It is 65 degrees"
      }
    },
    "data": {
      "targets": [
        {
          "type": "domain",
          "name": "climate",
          "id": "climate"
        }
      ],
      "success": [
        {
          "type": "entity",
          "name": "Ecobee",
          "id": "climate.ecobee"
        }
      ],
      "failed": []
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
### Error

An error occurred either during intent recognition or handling. See `data.code` for the specific type of error, and the speech property for the error message.
```json
{
  "response": {
    "response_type": "error",
    "language": "en",
    "data": {
      "code": "no_intent_match"
    },
    "speech": {
      "plain": {
        "speech": "Sorry, I didn't understand that"
      }
    }
  }
}
```
`data.code` is a string that can be one of:

- `no_intent_match` - The input text did not match any intents.
- `no_valid_targets` - The targeted area, device, or entity does not exist.
- `failed_to_handle` - An unexpected error occurred while handling the intent.
- `unknown` - An error occurred outside the scope of intent processing.
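As a sketch, an error response could be handled like this. The codes and the response shape come from this page; the fallback messages dict is our own and not part of Home Assistant.

```python
# Fallback text per documented error code, used when no speech is returned.
FALLBACK_MESSAGES = {
    "no_intent_match": "The input text did not match any intents.",
    "no_valid_targets": "The targeted area, device, or entity does not exist.",
    "failed_to_handle": "An unexpected error occurred while handling the intent.",
    "unknown": "An error occurred outside the scope of intent processing.",
}

def describe_error(result):
    """Return (code, spoken message) for an error response, or None otherwise."""
    response = result["response"]
    if response["response_type"] != "error":
        return None
    code = response["data"]["code"]
    speech = response.get("speech", {}).get("plain", {}).get("speech")
    return code, speech or FALLBACK_MESSAGES.get(code, "Unknown error.")
```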
## Speech

The spoken response to the user is provided in the `speech` property of the response. It can either be plain text (the default) or SSML.

For plain text speech, the response will look like:
```json
{
  "response": {
    "response_type": "...",
    "speech": {
      "plain": {
        "speech": "...",
        "extra_data": null
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
If the speech is SSML, it will instead be:
```json
{
  "response": {
    "response_type": "...",
    "speech": {
      "ssml": {
        "speech": "...",
        "extra_data": null
      }
    }
  },
  "conversation_id": "<generated-id-from-ha>"
}
```
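Since the speech payload is keyed by its type, a consumer has to check which form is present. A minimal helper, assuming only the two documented keys:

```python
def get_speech(response):
    """Extract (speech_type, text) from a conversation "response" object.

    Returns (None, None) when no speech is present.
    """
    speech = response.get("speech") or {}
    for kind in ("plain", "ssml"):
        if kind in speech:
            return kind, speech[kind]["speech"]
    return None, None
```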
## Conversation id

Conversations can be tracked by a unique id generated by Home Assistant, if supported by the answering conversation agent. To continue a conversation, retrieve the `conversation_id` from the HTTP API response (alongside the conversation response) and add it to the next input sentence:
Initial input sentence:

```json
{
  "text": "Initial input sentence."
}
```
The JSON response contains the conversation id:

```json
{
  "conversation_id": "<generated-id-from-ha>",
  "response": {
    (conversation response)
  }
}
```
POST with the next input sentence:

```json
{
  "text": "Related input sentence.",
  "conversation_id": "<generated-id-from-ha>"
}
```
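The continuation step above can be sketched as a helper that threads the id from the previous response into the next request body:

```python
def follow_up(previous_result, text):
    """Build the next request body, continuing the same conversation."""
    return {
        "text": text,
        "conversation_id": previous_result["conversation_id"],
    }
```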
## Pre-loading sentences

Sentences for a language can be pre-loaded using the WebSocket API:

```json
{
  "type": "conversation/prepare",
  "language": "en"
}
```
The following input fields are available:

| Name | Type | Description |
|---|---|---|
| `language` | string | Optional. Language of the sentences to load (defaults to configured language). |
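Commands sent over the Home Assistant WebSocket API also carry a monotonically increasing `id` field. As a sketch, a `conversation/prepare` message could be assembled like this; the counter is our own bookkeeping, not part of the API surface:

```python
import itertools

# Monotonically increasing message id, shared by all commands on a connection.
_msg_id = itertools.count(1)

def prepare_message(language=None):
    """Build a conversation/prepare command message."""
    msg = {"id": next(_msg_id), "type": "conversation/prepare"}
    if language is not None:
        msg["language"] = language
    return msg
```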