Adjusting for audio

Table 1. Adjusting for different types of input

Name

Values

Description

diarize

false (default), true, noise

Diarization is the process of recognizing distinct speakers on a single (mono) audio channel and segmenting detected speech into separate channels, which are identified in JSON output. Voci’s diarization capability is designed to do this for two speakers, typically a call agent engaged in a conversation with a client over the phone.

Set diarize to true only when all of the following conditions hold:

  • You know that your audio contains only a single audio channel.

  • You know that two people are talking on the channel.

  • Separating the two speakers in the transcripts is important for your use case.
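The conditions above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of any V‑Cloud client library: the function name and its arguments are hypothetical, and it assumes you already know the channel and speaker counts of your audio from your own metadata.

```python
def should_diarize(channel_count: int, speaker_count: int,
                   needs_speaker_separation: bool) -> bool:
    """Return True only when all three documented conditions hold.

    Hypothetical helper: the argument names are illustrative and the
    values must come from your own knowledge of the audio.
    """
    return bool(
        channel_count == 1            # single (mono) audio channel
        and speaker_count == 2        # exactly two people talking
        and needs_speaker_separation  # separation matters for the use case
    )


# Mono agent/client call where speaker separation matters.
print(should_diarize(1, 2, True))
# Stereo audio: the speakers are already on separate channels.
print(should_diarize(2, 2, True))
```

The result can then be used to decide whether to send diarize=true with a request.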

Enabling diarize will include the following fields in JSON output:

  • diascore — Indicates the system's level of confidence that it correctly classified detected speech into individual channels. The confidence level is expressed as a value between 0 and 1, where 1 indicates the best speaker separation. Refer to Confidence scores for more information on the confidence scoring system.

  • chaninfo — Provides additional information specific to each channel. chaninfo only appears for stereo or diarized audio. Refer to Top-level elements for more information.
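As an illustration of how diascore might be consumed, the sketch below flags transcripts whose speaker separation falls under a chosen threshold. It makes several assumptions that are not stated in this table: that diascore appears as a top-level key of the parsed JSON output, that the 0.5 threshold is a sensible cutoff (it is arbitrary), and the sample payload is invented purely for demonstration. Check Top-level elements for the actual layout.

```python
import json
from typing import Optional


def separation_is_reliable(transcript: dict,
                           threshold: float = 0.5) -> Optional[bool]:
    """Compare diascore against a caller-chosen threshold.

    Returns None when no diascore is present (for example, when
    diarize was not enabled). Assumes diascore is a top-level key,
    which is an assumption of this sketch.
    """
    score = transcript.get("diascore")
    if score is None:
        return None
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"diascore out of the documented 0..1 range: {score}")
    return score >= threshold


# Hypothetical payload, not real V-Cloud output.
sample = json.loads('{"diascore": 0.82}')
print(separation_is_reliable(sample))
```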

The noise setting is typically not needed. However, if you are experiencing excessive diarization errors due to interference from non-speech sources, you can apply noise reduction by setting diarize=noise.

Note: Redaction accuracy is marginally reduced when redaction is used in combination with diarize. For maximum redaction accuracy, avoid diarization when using any of the redaction options.

Diarization is a licensed optional feature.

transcode

false, true

The default for transcode is true for tokens issued after February 15, 2022. Prior to that, the default was false.

Determines whether V‑Cloud should use its built-in decoders to convert incoming audio. When transcode=true, V‑Cloud will try to convert incoming audio into a supported format.

MP3 files are not supported by V‑Cloud unless transcode=true.

Transcode functionality supports an extensive set of open audio formats. Submit your audio in a request to determine if the audio format is supported.

This option cannot be used with the truncate option. If transcode=true, truncate will be ignored.
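The two rules for transcode (the date-dependent default and the interaction with truncate) can be modeled client-side before a request is sent. This is a rough sketch under stated assumptions: the function names are hypothetical, options are held as Python booleans rather than in whatever form the API expects on the wire, "after February 15, 2022" is read as strictly later than that date, and the truncate value of 30 is a placeholder, not a documented value.

```python
from datetime import date

# Tokens issued after this date default to transcode=true.
TRANSCODE_DEFAULT_CUTOVER = date(2022, 2, 15)


def default_transcode(token_issued: date) -> bool:
    """Default value of transcode for a token issued on the given date."""
    return token_issued > TRANSCODE_DEFAULT_CUTOVER


def effective_options(options: dict, token_issued: date) -> dict:
    """Return the options as the text above says they will be applied.

    Fills in the transcode default when it is not set explicitly and
    drops truncate when transcode is true, because truncate is ignored
    in that case.
    """
    result = dict(options)
    transcode = result.get("transcode")
    if transcode is None:
        transcode = default_transcode(token_issued)
    result["transcode"] = transcode
    if transcode and "truncate" in result:
        del result["truncate"]
    return result


# Newer token, no explicit transcode: truncate is dropped.
print(effective_options({"truncate": 30}, date(2023, 1, 1)))
# Same options with transcode explicitly disabled: truncate is kept.
print(effective_options({"transcode": False, "truncate": 30}, date(2023, 1, 1)))
```

Setting transcode explicitly in every request avoids depending on the token's issue date.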