Redacting sensitive information from transcripts and audio
V‑Cloud’s redaction options can remove potentially sensitive numeric information from both transcripts and audio:
- When redaction is enabled by setting scrubtext to true, all instances of sensitive numeric digits in output transcripts are replaced by the hash sign (#).
- When redaction is enabled by setting scrubaudio to true, all audio segments containing sensitive numbers are replaced by silence. When using audio redaction without other customization options, results are returned as a zip archive containing transcripts and redacted MP3 files.
You can use the scrubconf option to specify the audio output format that you want to receive and the default allow list to use when scrubbing.
For example, to transcribe sample1.wav with text and audio redaction enabled, submit a request similar to the following:
curl -F token=your-token-here \
-F scrubtext=true \
-F scrubaudio=true \
-F file=@sample1.wav \
https://vcloud.vocitec.com/transcribe
The response to the request above should be a requestid that enables you to retrieve a results file containing both sample1.json and a redacted version of sample1.wav in MP3 format.
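When scripting this request, it can help to capture the requestid in a variable for later use. The following is a minimal sketch, not a definitive client. It assumes the response body is JSON shaped like {"requestid": "..."}, which this section does not confirm. The names VCLOUD_TOKEN, submit_redacted, and extract_requestid are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch only: VCLOUD_TOKEN, submit_redacted, and extract_requestid are
# hypothetical names, and the JSON response shape is an assumption.

VCLOUD_URL="https://vcloud.vocitec.com/transcribe"

# Submit an audio file with text and audio redaction enabled.
# Prints the raw response body returned by the service.
submit_redacted() {
    curl -sS \
        -F "token=${VCLOUD_TOKEN}" \
        -F scrubtext=true \
        -F scrubaudio=true \
        -F "file=@$1" \
        "${VCLOUD_URL}"
}

# Read a response body on stdin and print the requestid value.
# Assumes a body shaped like {"requestid": "..."}; prints nothing otherwise.
extract_requestid() {
    sed -n 's/.*"requestid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

With VCLOUD_TOKEN exported, a call such as requestid=$(submit_redacted sample1.wav | extract_requestid) would leave the identifier in $requestid. If the actual response format differs from the assumed JSON shape, only extract_requestid needs to change.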