Redacting sensitive information from transcripts and audio

V‑Cloud’s redaction options can remove potentially sensitive numeric information from both transcripts and audio:

  • When text redaction is enabled by setting scrubtext to true, every sensitive digit in the output transcripts is replaced by a hash sign (#).

  • When audio redaction is enabled by setting scrubaudio to true, all audio segments containing sensitive numbers are replaced by silence. When audio redaction is used without other customization options, results are returned as a zip archive containing transcripts and redacted MP3 files.
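
To make the effect of text redaction concrete, here is a minimal local sketch. It is not V‑Cloud's redaction logic: it masks every digit in a string, whereas V‑Cloud masks only the digits it identifies as sensitive. The function name mask_digits and the sample sentence are hypothetical.

```shell
# Hypothetical helper, for illustration only: replace every digit in the
# argument with '#'. V-Cloud itself masks only digits it treats as sensitive.
mask_digits() {
    printf '%s\n' "$1" | sed 's/[0-9]/#/g'
}

# Prints: my card number is #### #### #### ####
mask_digits "my card number is 4111 1111 1111 1111"
```

Non-digit characters, including spacing, pass through unchanged, so a masked transcript keeps the same layout as the original.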

You can use the scrubconf option to specify the audio output format that you want to receive and the default allow list to use when scrubbing.

For example, to transcribe sample1.wav with text and audio redaction enabled, submit a request similar to the following:

curl -F token=your-token-here \
     -F scrubtext=true \
     -F scrubaudio=true \
     -F file=@sample1.wav \
     https://vcloud.vocitec.com/transcribe

The response to the request above should be a requestid, which you use to retrieve a results file containing both sample1.json and a redacted version of sample1.wav in MP3 format.
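
If you script the request, you will probably want to capture the requestid for later use. The sketch below assumes the response body is a small JSON object with a "requestid" field. That format is an assumption and is not confirmed by this section. The function name extract_requestid and the sample response are hypothetical.

```shell
# Hypothetical helper: pull the value of a "requestid" field out of a JSON
# string. ASSUMPTION: the response looks like {"requestid": "<value>"}.
extract_requestid() {
    printf '%s\n' "$1" |
        sed -n 's/.*"requestid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Made-up sample response, not real V-Cloud output.
sample_response='{"requestid": "abc-123"}'

# Prints: abc-123
extract_requestid "$sample_response"

# With a live request, the same helper could be applied to the curl output:
#   response=$(curl -s -F token=your-token-here -F scrubtext=true \
#                   -F scrubaudio=true -F file=@sample1.wav \
#                   https://vcloud.vocitec.com/transcribe)
#   requestid=$(extract_requestid "$response")
```

The helper prints nothing when no requestid field is found, so a script can check for an empty result before continuing.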