Diarization is a language-independent process that analyzes a mono audio file. It assumes that exactly two people are speaking and separates the mono audio into two distinct channels by sorting the speech into two groups, one per speaker. In the structured transcript, one group is assigned to channel 0 and the other to channel 1.
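As a rough illustration of the channel assignment described above, the sketch below groups diarized speech segments by channel. The segment layout (start time, end time, channel, text) and the function name `split_by_channel` are assumptions made for this example; they are not the actual schema of the structured transcript.

```python
from collections import defaultdict

# Hypothetical diarized segments: (start_seconds, end_seconds, channel, text).
# This layout is an assumption for illustration, not the product's schema.
segments = [
    (0.0, 2.1, 0, "Thank you for calling, how can I help?"),
    (2.3, 4.8, 1, "Hi, I have a question about my bill."),
    (5.0, 6.2, 0, "Sure, let me pull that up."),
]


def split_by_channel(segments):
    """Group segment text by the channel assigned during diarization."""
    by_channel = defaultdict(list)
    for _start, _end, channel, text in segments:
        by_channel[channel].append(text)
    return dict(by_channel)


channels = split_by_channel(segments)
for channel in sorted(channels):
    print(f"channel {channel}: {len(channels[channel])} segment(s)")
```

Diarization only separates speech into two groups; which group ends up on channel 0 and which on channel 1 is what can go wrong in a channel-assignment error.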
Diarization may be less accurate when the source audio includes hold music, voice recordings, or more than two speakers. Overtalk, where both speakers talk at the same time, may also reduce accuracy. For typical agent-and-caller calls with only two speakers, however, diarization effectively separates the call into two distinct channels for enhanced analytics.
Channel-separated audio eliminates the possibility of channel-assignment errors and is therefore recommended whenever it is available.
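One way to tell whether a recording is already channel-separated is to inspect its channel count before submitting it. The following is a minimal sketch using Python's standard `wave` module; it assumes the recording is an uncompressed WAV file, and the function name `needs_diarization` is hypothetical.

```python
import wave


def needs_diarization(path):
    """Return True if the WAV file at `path` is mono (a single channel).

    A mono file has no channel separation, so the two speakers would have
    to be separated by diarization. A stereo file is already
    channel-separated if each speaker was recorded on a separate channel.
    """
    with wave.open(path, "rb") as wav:
        return wav.getnchannels() == 1
```

For example, `needs_diarization("call.wav")` returns `True` for a mono recording, where `call.wav` is a placeholder file name. The check only counts channels. It cannot confirm that each channel of a stereo file actually carries a different speaker.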