Substitutions

Substitution rules contain lists of “target:replacement” mappings, which enable you to correct consistent and frequent transcription errors that result from out-of-vocabulary words, excess noise in the audio, poor enunciation, or strong accents. They can also be used to correct word combinations that rarely occur in general speech but occur frequently within a specific domain or company.

For example, useful substitutions in the insurance domain could include “giant whale : giant hail” and “hit a beer : hit a deer”.
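The effect of such mappings can be sketched in a few lines of Python. This is only an illustration of the idea, not the ASR engine's actual matching logic: the one-rule-per-line layout, the whole-word case-insensitive matching, and the helper names `parse_rules` and `apply_rules` are all assumptions made for the example.

```python
import re


def parse_rules(text):
    """Parse newline-delimited 'target : replacement' rules into (target, replacement) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines (assumed behavior)
        target, replacement = line.split(":", 1)
        rules.append((target.strip(), replacement.strip()))
    return rules


def apply_rules(transcript, rules):
    """Apply each rule in order, matching the target as whole words, ignoring case."""
    for target, replacement in rules:
        pattern = r"\b" + re.escape(target) + r"\b"
        # A function replacement avoids any backslash interpretation in the replacement text.
        transcript = re.sub(pattern, lambda _m, r=replacement: r, transcript, flags=re.IGNORECASE)
    return transcript


if __name__ == "__main__":
    rules = parse_rules("giant whale : giant hail\nhit a beer : hit a deer")
    print(apply_rules("the roof was damaged by giant whale after i hit a beer", rules))
    # prints: the roof was damaged by giant hail after i hit a deer
```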

Table 1. Substitutions Parameters

Name | Availability | Values | Description

subst | V‑Blaze 7.1+ | true, false (default), none

The subst parameter can be used to enable or disable automatic system- and model-level substitutions.

subst=true: Enables system- and model-level substitutions

subst=false: Disables system-level substitutions; model-level substitutions still apply

subst=none: Disables both system- and model-level substitutions

This parameter is intended for debugging purposes only and should not be used in production.

substinfo | V‑Blaze 7.1+ | true, false (default)

Provides substitution details in JSON transcripts.

Set substinfo to true to include a top-level JSON object that lists the substitution rules that were applied and the number of times each rule was applied.

In addition to the top-level JSON object, substinfo adds a JSON object to the metadata that details each substitution's location, the rule applied, and the source of that rule.

This parameter is intended for debugging purposes only and should not be used in production.

Tip: The information provided by the substinfo parameter is especially helpful for developing and debugging substitution configurations.

subst_list | V‑Blaze, V‑Spark | filename

To use subst_list, deploy a substitution file to the /opt/voci/state/substitutions/ directory on every ASR server used for transcription. Refer to Substitutions for more information on creating and deploying substitution files.

Once the substitution file is in place, set the subst_list parameter to the name of the file, as shown in the following example.

subst_list=file1.sub,file2.sub
Note: To apply multiple substitution files, separate the file names with commas. The rules apply in the order specified.

Note: Use the subst_list method only when multiple clients share the same ASR servers. Placing the substitution file on the ASR servers keeps substitution rules synchronized across those clients.
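When requests are assembled in code, the comma-separated value can be built from an ordered list of file names. The following is a minimal sketch; the helper `build_subst_list` and the `params` dictionary are hypothetical names used for illustration and are not part of any V‑Blaze client library.

```python
def build_subst_list(filenames):
    """Join substitution file names into a subst_list value, preserving the given order."""
    cleaned = [name.strip() for name in filenames if name.strip()]
    for name in cleaned:
        if "," in name:
            # A comma inside a name would be read as a separator between files.
            raise ValueError(f"file name must not contain a comma: {name!r}")
    return ",".join(cleaned)


if __name__ == "__main__":
    # Hypothetical request parameters; the files must already exist on the ASR servers.
    params = {"subst_list": build_subst_list(["file1.sub", "file2.sub"])}
    print(params["subst_list"])  # prints: file1.sub,file2.sub
```

Because the rules apply in the order specified, the list order is kept as given rather than sorted.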

subst_rules | V‑Cloud, V‑Blaze 5.6+, V‑Spark | string/filename

Specifies a newline-delimited string that contains substitution rules to be applied during transcription. Rules can be specified directly as a string; however, the recommended approach is to populate the string automatically from the contents of a substitutions file, which makes the rules easier to edit and maintain over time.

If you are using cURL, you can specify a file for the subst_rules parameter as shown in the following example.

subst_rules=</path/to/file.sub
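In a cURL form field, the `<` prefix sends the contents of the file as the field value. A client written in Python can do the equivalent by reading the file itself, as in the sketch below. The helpers `load_subst_rules` and `build_form_fields` are hypothetical names, and dropping blank lines assumes that they carry no meaning in a substitutions file; sending the resulting fields to a transcription endpoint is left out because it depends on the deployment.

```python
from pathlib import Path


def load_subst_rules(path):
    """Read a substitutions file and return its rules as one newline-delimited string."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Assumption: blank lines are not significant, so they are dropped.
    return "\n".join(line.strip() for line in lines if line.strip())


def build_form_fields(rules_path, **extra):
    """Build request form fields with subst_rules populated from a file."""
    fields = {"subst_rules": load_subst_rules(rules_path)}
    fields.update(extra)  # any other request parameters, passed through unchanged
    return fields
```

For example, `build_form_fields("/path/to/file.sub", substinfo="true")` returns a dictionary whose `subst_rules` entry holds the file's rules, so the rules can be maintained in one file rather than embedded in client code.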