HTTP results streaming
HTTP results streaming in the V‑Blaze REST API offers a simpler, less powerful alternative to WebSockets for receiving utterance results in real time. When combined with chunked transfer uploads, it provides a bidirectional streaming interface over HTTP. This option is not supported if scrubaudio or zip output is requested.
Three flows allow for HTTP result streaming: transcription result streaming, audio result streaming, and transcription result streaming with redacted audio.
Transcription result streaming
The transcription result streaming flow sends utterance results back to the user in a line-delimited JSON format. Each utterance is delimited by a CRLF ('\r\n'). After all utterances have been streamed, two CRLFs are sent, followed by the complete transcription. The formats of the utterances and of the complete transcription can be controlled using the utterance_fmt and output tags.
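The framing described above can be consumed incrementally. The following is a minimal sketch, not an official client: it assumes each utterance arrives as single-line JSON, that the first empty line marks the end of the utterances, and that everything after it is the complete transcription. The function name iter_stream_events and the "text" field used in the usage note are hypothetical.

```python
import json
from typing import Any, Iterable, Iterator, Tuple

CRLF = b"\r\n"


def iter_stream_events(chunks: Iterable[bytes]) -> Iterator[Tuple[str, Any]]:
    """Yield ("utterance", obj) for each JSON line, then ("final", raw_bytes).

    `chunks` is any iterable of byte strings, for example the pieces of a
    chunked HTTP response body as they arrive.
    """
    buf = b""
    in_final = False
    for chunk in chunks:
        buf += chunk
        if in_final:
            # Past the separator: keep collecting the complete transcription.
            continue
        while True:
            idx = buf.find(CRLF)
            if idx < 0:
                break  # No full line yet; wait for more data.
            line, buf = buf[:idx], buf[idx + len(CRLF):]
            if line:
                yield ("utterance", json.loads(line))
            else:
                # Assumption: an empty line means the utterances are done.
                in_final = True
                break
    if in_final:
        # Drop any remaining separator bytes before the final payload.
        yield ("final", buf.lstrip(CRLF))
```

Each utterance can be handled as soon as its line is complete, for example by printing event[1]["text"] if the chosen utterance_fmt produces such a field. The final payload is returned as raw bytes because its format depends on the output tag.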
This flow is enabled by default when realtime=true and no utterance_callback is provided. It can always be disabled by specifying the outstream=false tag.
If outstream=true is specified without realtime=true, utterances will be streamed back in the format described above; however, this stream will not occur in real time.
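The rules for when results are streamed can be summarized as a small decision function. This is only a sketch of the behavior described above, written from the client's point of view. The function names are hypothetical, and treating tag values as case-insensitive "true" strings is an assumption about how the server parses them.

```python
from typing import Mapping


def _is_true(value: str) -> bool:
    # Assumption: boolean tags are passed as the strings "true" / "false".
    return str(value).strip().lower() == "true"


def results_are_streamed(tags: Mapping[str, str]) -> bool:
    """Return True if utterances are expected on the HTTP response stream."""
    if "outstream" in tags:
        # An explicit outstream tag always wins, in either direction.
        return _is_true(tags["outstream"])
    # Default: streaming is on for real-time requests without a callback.
    return _is_true(tags.get("realtime", "false")) and "utterance_callback" not in tags


def stream_is_realtime(tags: Mapping[str, str]) -> bool:
    """Return True if the streamed utterances are delivered in real time."""
    return results_are_streamed(tags) and _is_true(tags.get("realtime", "false"))
```

For example, results_are_streamed({"outstream": "true"}) is True while stream_is_realtime({"outstream": "true"}) is False, which matches the case of outstream=true without realtime=true.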
Audio result streaming
Scrubbed audio may be streamed over HTTP if scrubaudio=true, notext=true, and outstream=true are all specified. The resulting stream will contain uncompressed scrubbed audio in WAV format.
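On the receiving side, the audio stream can be written to disk as it arrives. The sketch below uses only the tags named above and makes no assumption about the request itself. The helper name save_scrubbed_audio is hypothetical, and the header check assumes a standard RIFF/WAVE container, which is how uncompressed WAV audio is normally framed.

```python
from typing import BinaryIO, Iterable

# Tags taken from the text above; how they are attached to the request
# (form fields, query string) is left to the caller.
AUDIO_STREAM_TAGS = {"scrubaudio": "true", "notext": "true", "outstream": "true"}


def save_scrubbed_audio(chunks: Iterable[bytes], dest: BinaryIO) -> int:
    """Copy streamed audio chunks to `dest` and return the bytes written.

    The first 12 bytes are checked for the RIFF/WAVE signature so that an
    error response is not silently saved as audio.
    """
    header = b""
    total = 0
    checked = False
    for chunk in chunks:
        if not checked:
            # Buffer whole chunks until the 12-byte signature is available.
            header += chunk
            if len(header) >= 12:
                if header[:4] != b"RIFF" or header[8:12] != b"WAVE":
                    raise ValueError("response does not look like WAV audio")
                dest.write(header)
                total += len(header)
                checked = True
            continue
        dest.write(chunk)
        total += len(chunk)
    if not checked:
        raise ValueError("stream ended before a WAV header was received")
    return total
```

A typical use would open the destination with open("scrubbed.wav", "wb") and pass the response body chunks from whichever HTTP client is in use.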
Transcription result streaming with redacted audio
If both scrubaudio and outstream are true but notext is not specified, the utterance transcriptions will be streamed back in the format described above; however, instead of being followed by just the complete transcription, a ZIP file containing both the complete transcription and the redacted audio will be sent.
Note that this flow cannot stream redacted audio in real time. The final ZIP file is only sent after all audio data has been processed. WebSockets must be used if both real-time utterance streaming and real-time redacted audio streaming are necessary.
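Once the whole response has been received, the utterance lines and the trailing ZIP file can be separated. The following is a rough sketch under the same framing assumptions as the transcription flow: single-line JSON utterances, then an empty line, then the ZIP bytes. The function name is hypothetical, and the names of the files inside the archive are not specified here, so the sketch only lists them.

```python
import io
import json
import zipfile
from typing import Any, List, Tuple

CRLF = b"\r\n"


def split_utterances_and_zip(body: bytes) -> Tuple[List[Any], zipfile.ZipFile]:
    """Split a fully received response into utterances and the final ZIP."""
    # Split at the first empty line only: the ZIP payload is binary and may
    # itself contain CRLF byte sequences.
    head, sep, tail = body.partition(CRLF + CRLF)
    if not sep:
        raise ValueError("no blank-line separator found in response body")
    # Drop any separator bytes left in front of the archive.
    while tail.startswith(CRLF):
        tail = tail[len(CRLF):]
    utterances = [json.loads(line) for line in head.split(CRLF) if line]
    return utterances, zipfile.ZipFile(io.BytesIO(tail))
```

Calling namelist() on the returned archive shows which members hold the complete transcription and the redacted audio, and read(name) returns the bytes of a member.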