Out-of-vocabulary (OOV)
OOV (out-of-vocabulary) is an ASR tuning feature designed to improve transcription accuracy for audio that contains brand- and industry-specific terminology. OOV extends the existing language model with new words and gives those words preferential treatment.
Name | Value | Description |
---|---|---|
oov | JSON object literal | Defines the vocabulary to be added to the request's language model. There are two components to the data included with OOV requests: the vocabulary, which consists of the words and phrases that make up the OOV terms; and the dictionary, which maps non-standard terms to sounds (referred to as sound-outs). Sound-outs are optional. If sound-outs are not supplied with the request, the ASR engine will use its standard interpretations of words, and in many cases this is sufficient. Learn more: Vocabulary development |
OOV JSON syntax
OOV uses a dictionary of terms and their pronunciations. The dictionary is submitted with each request that uses OOV. The dictionary must include the vocab key, which lists OOV terms or contextual phrases that include those terms, and may include the dict key, which maps OOV terms to their approximate pronunciations (sound-outs). Sound-outs are optional but should improve performance for made-up words, or when the relationship between a word's spelling and its pronunciation is otherwise unusual.
For example, the following JSON object literal defines approximate pronunciations for Voci, V-Blaze, and V-Cloud.
{ "vocab" : ["Voci Technologies", "V-Blaze transcription engine", "V-Cloud interface"],
"dict" : { "Voci" : ["vo chee","woe chee","vo see","woe see","vo sigh","woe sigh"],
"V-Blaze" : "vee blaze",
"V-Cloud" : "vee cloud"}
}
Download a copy of the file used in this example: example_oov.json
Key | Value | Description | Example |
---|---|---|---|
vocab | string or list of strings | Defines OOV terms, and contextual phrases in which OOV terms are likely to occur | ["Voci Technologies", "V-Cloud interface"] |
dict | JSON object mapping each OOV term to a string or list of strings | Defines OOV terms and their approximate pronunciations | {"V-Blaze" : "vee blaze"} |
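As a rough illustration of the syntax described above, the following Python sketch assembles an OOV object and checks it against the rules stated in this section (vocab is required, dict is optional, and values are a string or a list of strings). The helper name build_oov and its checks are hypothetical and are not part of the ASR API; the engine may apply further validation of its own.

```python
import json

def build_oov(vocab, sound_outs=None):
    """Hypothetical helper: return an OOV object with a required "vocab"
    key and an optional "dict" key, following the table above."""
    # Accept a single string or a list of strings for vocab.
    phrases = [vocab] if isinstance(vocab, str) else list(vocab)
    if not phrases or not all(isinstance(p, str) and p.strip() for p in phrases):
        raise ValueError("vocab must be a non-empty string or list of non-empty strings")

    oov = {"vocab": phrases}

    # Sound-outs are optional; include "dict" only when they are supplied.
    if sound_outs:
        cleaned = {}
        for term, sounds in sound_outs.items():
            sounds = [sounds] if isinstance(sounds, str) else list(sounds)
            if not sounds or not all(isinstance(s, str) and s.strip() for s in sounds):
                raise ValueError(f"sound-outs for {term!r} must be non-empty strings")
            # Keep a lone sound-out as a plain string, as in the example above.
            cleaned[term] = sounds[0] if len(sounds) == 1 else sounds
        oov["dict"] = cleaned
    return oov

oov = build_oov(
    ["Voci Technologies", "V-Blaze transcription engine", "V-Cloud interface"],
    {
        "Voci": ["vo chee", "woe chee", "vo see", "woe see", "vo sigh", "woe sigh"],
        "V-Blaze": "vee blaze",
        "V-Cloud": "vee cloud",
    },
)

# Serialize the object; the result can be saved to a file or sent in-line.
print(json.dumps(oov, indent=2))
```

Writing the printed JSON to a file gives a dictionary equivalent to the example shown earlier in this section.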
Example request
OOV dictionaries may be specified in-line with the request or as a file.
For example, this cURL request passes an OOV dictionary stored in a file named example-oov.json. The < prefix tells cURL to read the field's value from that file:
curl -F output=text -F oov="</path/to/example-oov.json" -F file=@example.wav https://asr.example.com/transcribe
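The in-line form is not shown above. A minimal Python sketch of how such a request might be assembled follows, assuming the oov field accepts the same JSON text that the file would contain. The function name inline_oov_curl_args is hypothetical. The sketch uses cURL's --form-string option, which sends the value literally, so that characters in the JSON are not interpreted as cURL form syntax.

```python
import json
import shlex

def inline_oov_curl_args(oov, audio_path, url):
    """Hypothetical helper: build a cURL argument list that passes the
    OOV dictionary in-line instead of reading it from a file."""
    # Compact JSON keeps the form field on a single line.
    payload = json.dumps(oov, separators=(",", ":"))
    return [
        "curl",
        "-F", "output=text",
        # --form-string treats the value literally (no @, < or ; parsing).
        "--form-string", f"oov={payload}",
        "-F", f"file=@{audio_path}",
        url,
    ]

args = inline_oov_curl_args(
    {"vocab": ["V-Blaze transcription engine"], "dict": {"V-Blaze": "vee blaze"}},
    "example.wav",
    "https://asr.example.com/transcribe",
)

# Print a shell-quoted command line that can be pasted into a terminal.
print(shlex.join(args))
```

Building the argument list rather than a single string avoids manual shell quoting of the JSON; the same list can be passed to subprocess.run to execute the request.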