lid

Values: `false`, `true`, `language`, `language_model`

Description:

The `lid` parameter enables you to use the ASR engine's Language Identification (LID) module to identify the language spoken in the input audio and automatically use an appropriate language model. To force the use of an alternate model with a different "domain" name, specify it using `lid=language_model`.
- `lid=true` - automatically selects the language identification model based on the LID and language models that are available.
- `lid=language` - the alternative language to detect. The primary language is determined from the primary model; an alternative language model of the specified language is automatically selected.
- `lid=language_model` - the alternate language model to use.
- `lid=language_model,language_model` - for dual-channel audio, this specifies the alternate language models to use for channel 0 and channel 1, respectively.
- `lid=LANG:notext` - use this to skip decoding text from speech when the language `LANG` is detected. If this option is specified and `LANG` is detected, then utterances in the request's JSON transcript output do not contain any word events or metadata models. Specify only the base language model when using this option; do not include the region or domain. For example, `lid=spa:notext` is valid, but `lid=eng-us:callcenter:notext` is not.
- `lid=language:info` - use this to decode all audio using the primary model, but provide language identification information in the transcript.
- `lid=false` - LID is not used.
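The `:notext` base-language rule above can be checked client-side before sending a request. The following is a minimal sketch; the helper `is_valid_notext` is hypothetical (not part of the ASR API), and it assumes base language codes are exactly three lowercase letters, as in `spa` and `eng`:

```python
import re

def is_valid_notext(value: str) -> bool:
    """Hypothetical client-side check that a lid=LANG:notext value uses
    only a base language code, with no region or domain qualifiers."""
    lang, _, option = value.partition(":")
    if option != "notext":
        # Rejects extra qualifiers such as "eng-us:callcenter:notext",
        # where the remainder after the first ":" is not just "notext".
        return False
    # Assumption: a base language code is exactly three lowercase letters.
    return re.fullmatch(r"[a-z]{3}", lang) is not None

print(is_valid_notext("spa:notext"))                # True
print(is_valid_notext("eng-us:callcenter:notext"))  # False
```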
The following parameters provide additional options when using the `lid` tag:
Name | Values | Description
---|---|---
`lidmaxtime` | integer (default is 20) | Specifies the maximum audio duration (in seconds) to analyze.
 | float between 0 and 1 (default is 0) | Adjusts the confidence level required for the system to select the alternative language. Setting this option to values greater than zero increases preference for the default model.
`lidthreshold` | float (default is 0.7) | Specifies the confidence level required before LID stops analyzing audio.
 | integer | Delays the start of LID until the specified number of seconds (N) into the audio. If there is not enough audio left after the offset, preceding utterances are processed in reverse.
 | float between 0 and 1 (default is 0.5) | Defines the prior probability distribution of the alternative language.
 | true, false | Runs LID on every utterance. The default is to run it only once per stream or audio channel. This option is only available with V‑Blaze 7.1 and later. Note: This option has a significant performance impact and should only be used when necessary.
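As a concrete illustration, the `lid` tag and its tuning parameters can be combined in a single request. The sketch below only assembles a query-string fragment with assumed example values; the actual transport (form field, query string, or JSON body) depends on your deployment, and only `lidmaxtime` and `lidthreshold` are named in this section:

```python
# Assumed example values; lid selects Spanish as the alternative language.
params = {
    "lid": "spa",         # alternative language to detect
    "lidmaxtime": 20,     # analyze at most 20 seconds of audio (the default)
    "lidthreshold": 0.7,  # stop analyzing once this confidence is reached
}

# Render the parameters as a query-string fragment.
query = "&".join(f"{key}={value}" for key, value in params.items())
print(query)  # lid=spa&lidmaxtime=20&lidthreshold=0.7
```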
LID stops analyzing audio when its confidence score reaches the threshold set in `lidthreshold` or the analyzed audio goes over the duration limit set in `lidmaxtime`. When LID scoring is below the decision threshold, the ASR engine transcribes the audio with the language model specified by the `model` tag (or the default model for the ASR configuration if `model` is not explicitly provided). The results are indicated by a `lidinfo.langfinal` element in the JSON output.
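The `lidinfo.langfinal` element can be read back from the JSON transcript. The fragment below is a minimal sketch: the surrounding transcript structure (an `utterances` array) is an assumption for illustration, and only the `langfinal` field is taken from this section:

```python
import json

# Hypothetical transcript fragment; only lidinfo.langfinal is documented above.
transcript = json.loads("""
{
  "utterances": [
    {"lidinfo": {"langfinal": "eng"}},
    {"lidinfo": {"langfinal": "spa"}}
  ]
}
""")

# Collect the final language decision for each utterance.
final_langs = [utt.get("lidinfo", {}).get("langfinal")
               for utt in transcript["utterances"]]
print(final_langs)  # ['eng', 'spa']
```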
Language identification is a licensed optional feature.
For additional information about using the `lid` tag, see: