params
URL: http://vblaze_name:17171/params
Example Response:
{
  "params": {
    "activitylevel": 175,
    "bufmaxtime": 30,
    "endian": "LITTLE",
    "idletimeout": 30,
    "languages": [
      "eng1"
    ],
    "model": "eng1:callcenter",
    "models": [
      "eng1:callcenter",
      "eng1:survey",
      "eng2:voicemail"
    ],
    "numtrans": true,
    "outputdir": "/opt/voci/ramfs",
    "punctrailing": 12,
    "punctuate": true,
    "pushconntimeout": 5,
    "queue": "bottom",
    "raw_events": false,
    "realtime": false,
    "recvtimeout": null,
    "scrubmindist": 0.3,
    "uttmaxsilence": 800,
    "uttmaxtime": 80,
    "uttminactivity": 500,
    "uttpadding": 300,
    "vadparams": {}
  }
}
Explanation:
The example response is a JSON object that shows the names of default system parameters that can be specified when initiating a transcription session and their default values.
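A minimal Python sketch of reading this response is shown here. The helper names parse_params and fetch_params are hypothetical, the base URL reuses the vblaze_name placeholder from the URL above, and the sample body is an abbreviated copy of the example response rather than output from a live server.

```python
import json
from urllib.request import urlopen


def parse_params(body: str) -> dict:
    """Return the object stored under the top-level "params" key."""
    return json.loads(body)["params"]


def fetch_params(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/params and return the parsed defaults.

    base_url is an assumption, for example "http://vblaze_name:17171".
    """
    with urlopen(f"{base_url}/params", timeout=timeout) as response:
        return parse_params(response.read().decode("utf-8"))


# Abbreviated copy of the example response, used instead of a live call.
SAMPLE_BODY = (
    '{"params": {"model": "eng1:callcenter", '
    '"queue": "bottom", "recvtimeout": null}}'
)

defaults = parse_params(SAMPLE_BODY)
print(defaults["model"])        # eng1:callcenter
print(defaults["recvtimeout"])  # None (JSON null maps to Python None)
```

Against a running system, fetch_params("http://vblaze_name:17171") would return the same kind of dictionary, with vblaze_name replaced by the actual host name.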
activitylevel
Indicates the voice activity level below which the ASR engine considers any audio event to be silence.
This parameter is only applicable when vadtype=level. Refer to Voice Activity Detection and utterance controls for more information.
bufmaxtime
Indicates the current buffer size configuration, which is the maximum amount of audio (seconds) to analyze for diarization.
endian
Indicates the byte ordering of audio samples. In a BIG endian data word, the most significant byte comes first, when reading from left to right. In a LITTLE endian data word, the least significant byte comes first. By convention, LITTLE endian (the default) is the most common.
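The difference between the two byte orders can be illustrated with Python's struct module. This is only a sketch: it assumes 16-bit signed samples, which this page does not specify, and the sample value is arbitrary.

```python
import struct

sample = 0x1234  # illustrative 16-bit sample value (4660 in decimal)

little = struct.pack("<h", sample)  # LITTLE: least significant byte first
big = struct.pack(">h", sample)     # BIG: most significant byte first

print(little.hex())  # 3412
print(big.hex())     # 1234

# Decoding little-endian bytes as big-endian yields a different sample,
# which is why the endian setting has to match the audio being sent.
misread = struct.unpack(">h", little)[0]
print(misread)       # 13330
```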
idletimeout
Indicates the amount of time an API request will wait for a response before timing out.
languages
Indicates the languages that are available. The languages indicated are components of the model used by the ASR engine for transcription. These components convert the stream of sound symbols from an acoustic model into text.
models
Indicates the default language model, along with the models available for transcription requests.
The remaining JSON keys and values are default system parameters and their default values. Most of these parameters never require modification, but are provided to enable tuning for special circumstances, such as aggressive real-time applications.
numtrans
Indicates the default value of numtrans. The numtrans parameter controls whether number words in transcribed text are converted into numeric digits and related conventional formats, including dollar amounts, wall-clock times, percentages, ordinals, and telephone numbers.
outputdir
Indicates the directory where intermediate processing and result data is stored.
punctrailing
Indicates the number of words required for a sentence to be created within the punctuation engine. If punctrailing is 12, then the punctuation engine waits until there are at least 12 words after a sentence before finalizing that sentence.
punctuate
Indicates the default value of punctuate. The punctuate parameter controls whether transcript text is punctuated. In most cases, it is desirable to leave punctuation turned on, but there are special cases where it should be turned off.
pushconntimeout
Indicates the number of seconds to wait for a push data source to initiate a connection to the decode server.
pushconntimeout cannot exceed 120 seconds.
queue
Indicates the default value of queue. The queue parameter determines the order of requests in the transcription queue. Setting the value to bottom inserts the stream at the end of the queue, which means transcripts are processed in the order they are received. Setting the value to top inserts the stream at the beginning of the queue. Setting queue to top is useful for skipping the queue when submitting high-priority jobs.
raw_events
Indicates the default value of raw_events. When enabled, the raw_events parameter includes an additional raw_events list in the JSON output under utterances. This list includes silence, filler words, wordex, and un-punctuated text.
realtime
Indicates the default value of realtime. The realtime parameter controls whether the ASR engine processes incoming audio in real-time mode.
recvtimeout
Indicates the amount of time, in milliseconds, before timing out when receiving audio data. A value of null (the default) or 0 means no timeout.
scrubmindist
Indicates the default value of scrubmindist. The scrubmindist parameter specifies the number of seconds within which two scrubbed audio sections will be merged when the scrubaudio parameter is set to true.
uttmaxsilence
Indicates the default value of uttmaxsilence. The uttmaxsilence parameter specifies the maximum amount of silence in milliseconds that can occur between speech sounds without terminating the current utterance. Once a silence occurs that exceeds uttmaxsilence milliseconds, an utterance “cut” is made within the detected silent region.
This parameter is only applicable when vadtype=level. Refer to Voice Activity Detection and utterance controls for more information.
uttmaxtime
Indicates the default value of uttmaxtime. The uttmaxtime parameter specifies the maximum amount of time in seconds that is allotted for a spoken utterance. Normally an utterance is terminated by a sufficient duration of silence, but if no such period of silence is encountered prior to reaching uttmaxtime, the utterance is terminated forcibly.
uttminactivity
Indicates the default value of uttminactivity. The uttminactivity parameter specifies how much activity is needed (without uttpadding) for audio to be classified as an utterance.
uttpadding
Indicates the default value of uttpadding. The uttpadding parameter specifies how much padding around the active area to treat as active. Typically the higher the activitylevel, the more padding is needed. Lower activity levels require less padding.
This parameter is only applicable when vadtype=level. Refer to Voice Activity Detection and utterance controls for more information.
vadparams
Indicates the parameters configured as default for the voice activity detection (VAD) component of V‑Blaze. VAD is controlled by both the ASR engine and the language model used for the request, and some VAD settings vary by language model.
Both audio-independent and audio-dependent parameters are discussed in V‑Blaze transcription parameters.
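As an illustration of the queue parameter described above, the following Python sketch models the two insertion positions with a double-ended queue. It is a toy model for reasoning about ordering, not V‑Blaze's implementation; the enqueue function and the request identifiers are hypothetical.

```python
from collections import deque


def enqueue(pending: deque, request_id: str, queue: str = "bottom") -> None:
    """Insert a request at the end ("bottom") or the front ("top")."""
    if queue == "bottom":
        pending.append(request_id)      # processed after everything waiting
    elif queue == "top":
        pending.appendleft(request_id)  # processed before everything waiting
    else:
        raise ValueError(f"queue must be 'bottom' or 'top', got {queue!r}")


pending = deque()
enqueue(pending, "job-1")                # default: bottom
enqueue(pending, "job-2")
enqueue(pending, "urgent", queue="top")  # skips ahead of waiting jobs

print(list(pending))  # ['urgent', 'job-1', 'job-2']
```

In this model requests are taken from the front of the queue, so with the default value of bottom they are processed in the order they were received.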