Adjusting for audio
Name |
Values |
Description |
---|---|---|
datahdr |
WAVE (Default for files with |
In V‑Blaze version 7.2+, Set Set
Note: MP3 files do not work with V‑Blaze.
|
diarize |
false (default), true, noise |
Diarization is the process of recognizing distinct speakers on a single (mono) audio channel and segmenting detected speech into separate channels, which are identified in JSON output. Voci’s diarization capability is designed to do this for two speakers, typically a call agent engaged in a conversation with a client over the phone. You should only set
Enabling
The
Note: Redaction accuracy is marginally reduced when used in combination with diarize. Avoid diarization when using any of the redaction options for maximum redaction accuracy.
Diarization is a licensed optional feature. |
encoding |
SPCM, UPCM, ULAW, ALAW |
Specifies the algorithm used to encode the audio. Encoding must be supplied when raw or headerless audio is being transcribed. Refer to encoding for more information on this parameter. |
endian |
LITTLE (default), BIG |
Specifies the byte ordering of audio samples. In a BIG endian data word, the most significant byte comes first when reading from left to right; in a LITTLE endian data word, the least significant byte comes first. LITTLE endian (the default) is the more common ordering, so this parameter is not required unless your audio uses BIG endian byte ordering. |
nchannels |
integer |
Specifies the number of audio channels. Required for real-time decoding when the audio has no data header. |
resample [INTERNAL ONLY] |
true (default), false, integer |
When resample=true, all files with a sample rate above 8000 Hz are resampled to 8000 Hz. Set resample=false to disable resampling, or set resample to an integer to resample to that sample rate. |
samprate |
integer |
Specifies the sampling rate, in Hz, of the audio to be transcribed. Telephone audio is typically sampled at 8000 Hz. For best results, the sampling rate should be a multiple of 8000 (e.g., 8000, 16000, or 24000). Values less than 8000 are not supported. The sampling rate must be supplied when raw or headerless audio is being transcribed. |
sampwidth |
integer |
Specifies the size of each digitized audio sample in bytes. This parameter is only applicable, but must be supplied, when raw or headerless audio is being transcribed and the encoding parameter is set to either SPCM or UPCM. |
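
The rules in the table for raw or headerless audio can be checked before a request is submitted. The following is a minimal Python sketch, not part of V‑Blaze: the function name `raw_audio_params` is hypothetical, the `nchannels` default of 1 and the positive-value checks are assumptions of this sketch, and how the resulting parameters are passed to V‑Blaze is out of scope here.

```python
from typing import Optional

# Allowed values taken from the table above.
VALID_ENCODINGS = {"SPCM", "UPCM", "ULAW", "ALAW"}
VALID_ENDIAN = {"LITTLE", "BIG"}


def raw_audio_params(
    encoding: str,
    samprate: int,
    nchannels: int = 1,              # assumption: mono unless stated otherwise
    sampwidth: Optional[int] = None,
    endian: str = "LITTLE",
) -> dict:
    """Build a parameter dict for raw/headerless audio (hypothetical helper)."""
    if encoding not in VALID_ENCODINGS:
        raise ValueError(f"encoding must be one of {sorted(VALID_ENCODINGS)}")
    if samprate < 8000:
        # The table states that values below 8000 are not supported.
        raise ValueError("samprate values less than 8000 are not supported")
    if endian not in VALID_ENDIAN:
        raise ValueError("endian must be LITTLE or BIG")
    if nchannels < 1:
        # Assumption of this sketch: at least one channel.
        raise ValueError("nchannels must be a positive integer")

    params = {"encoding": encoding, "samprate": samprate, "nchannels": nchannels}

    if encoding in {"SPCM", "UPCM"}:
        # sampwidth must be supplied for SPCM and UPCM.
        if sampwidth is None or sampwidth < 1:
            raise ValueError("sampwidth (bytes per sample) is required for SPCM/UPCM")
        params["sampwidth"] = sampwidth
    elif sampwidth is not None:
        # The table says sampwidth only applies to SPCM/UPCM; rejecting it
        # for other encodings is a choice made by this sketch.
        raise ValueError("sampwidth only applies when encoding is SPCM or UPCM")

    # LITTLE is the default, so endian is only included for BIG endian audio.
    if endian != "LITTLE":
        params["endian"] = endian
    return params
```

For example, `raw_audio_params("SPCM", 16000, sampwidth=2)` returns `{"encoding": "SPCM", "samprate": 16000, "nchannels": 1, "sampwidth": 2}`, while calling it with `"SPCM"` and no `sampwidth` raises a `ValueError`.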