Common tags
Name | Values | Description |
---|---|---|
callback (optional) | URL | The URL to which V-Blaze will POST transcripts. A callback is the address and (optionally) method name and parameters of a web application that can receive data via HTTP or HTTPS. Callbacks are typically used to let another application receive and directly act on the transcripts produced. V‑Blaze transcripts are normally returned immediately and directly to the user or application that submitted the audio file for transcription. When a callback is specified, the resulting transcript is POSTed to the callback address and is not returned in the response. V‑Blaze does not retry failed callbacks. |
file (required) | PCM audio data in WAVE or RAW format | A single audio file to process. |
model (optional) | see language models | Indicates which language model(s) should be used to transcribe the audio. This parameter can be set to a single language model or a list of language models. If not specified, the default model will be used. Refer to model for more information on this parameter. |
output (optional) | json (default), jsontop, text, jsonlist, jsontext, noutt | Indicates the desired output format. Refer to output for more information on this parameter. |
realtime (optional) | false (default), true | Controls whether the ASR engine processes incoming audio in real-time mode. Real-time mode is enabled by a license setting and cannot be turned on with this tag if it is not enabled in the license. This tag is only useful for specifying that the ASR engine should not process incoming audio in real time even though real-time mode is enabled in the license. |
requestid (optional) | any string | The unique identifier for the request, used for tracing purposes. This can be specified as a parameter or in the X-Request-Id HTTP header. If a requestid is provided in one of these ways, the specified requestid is included in JSON output and in the WebAPI access log. Refer to requestid for WebAPI for more information on how to use requestid. |
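As an illustration of how these tags combine in a single request, the following sketch submits an audio file with a callback, a model, and an output format. The host, port, and endpoint follow the examples later in this section; the callback URL is a placeholder, and the model name must be one authorized for your account:
$ curl -F "callback=https://example.com/transcript-receiver" -F "model=eng-us:callcenter" -F "output=jsontop" -F "file=@/opt/voci/server/examples/sample1.wav" http://example:17171/transcribe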
model
Values: installation-dependent
Description:
The model parameter is used to specify the language model(s) to use for transcription. This parameter can be set to a single language model to transcribe all channels, or to a comma-separated list of language models. V‑Cloud only supports a single language model for this parameter. Voci works with customers to ensure that their deployment delivers the best results possible, providing the language models that are most closely associated with the types of audio that each customer is transcribing. You will receive the model names that are authorized for your account from Voci Support.
V‑Blaze supports a comma-separated list of models in channel order. For example, if the client is on channel 0 and the agent is on channel 1, you could use a different model for each channel by setting the model parameter to model=eng1:client,eng1:agent. That setting would use the eng1:client language model to transcribe channel 0 and the eng1:agent language model to transcribe channel 1.
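A request using per-channel models might look like the following sketch. The host, port, and endpoint follow the earlier examples; stereo_call.wav is a hypothetical two-channel recording, and the model names must be ones authorized for your account:
$ curl -F "model=eng1:client,eng1:agent" -F "output=jsontop" -F "file=@stereo_call.wav" http://example:17171/transcribe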
If you don't specify a value for the model parameter, the first available model in your configuration will be used. To determine the default model, use the /models API call as illustrated in the following example.
$ curl http://example:17171/models
{"models":["eng-us:callcenter","eng1:voicemail","eng1:survey"]}
As shown in the example above, if you did not specify a model when transcribing audio, the eng-us:callcenter model would be used.
Refer to Language models for more information on supported languages.
requestid for WebAPI
The requestid is a unique identifier for the request, used for tracing purposes. It can be specified as a parameter or in the X-Request-Id HTTP header. If a requestid is provided in one of these ways, the specified requestid is included in JSON output and in the WebAPI access log.
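For example, a sketch of a request that supplies the identifier in the X-Request-Id header rather than as a form parameter (the host, port, and file path follow the earlier examples; the header value is arbitrary):
$ curl -H "X-Request-Id: john,1234,567-uuid" -F "output=jsontop" -F "file=@/opt/voci/server/examples/sample1.wav" localhost:17171/transcribe; echo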
The requestid is included in the final transcript and also in utterance callbacks as a top-level field.
The requestid can be anything. For example, you could pass an ID that is used to fetch metadata from a table, or you could pass all of the metadata in the requestid itself as a CSV string or in any other format you prefer, such as JSON. The following example shows a requestid passed as a form parameter:
$ curl -F "requestid=john,1234,567-uuid" -F "output=jsontop" -F "file=@/opt/voci/server/examples/sample1.wav" localhost:17171/transcribe; echo
{"source":"sample1.wav","confidence":0.89,"donedate":"2020-01-23 13:03:02.881927","requestid":"john,1234,567-uuid","recvtz":["EST",-18000],"text":"And that it was resolved in a very professional manner. Your employees a very good.","model":"devel:callcenter","recvdate":"2020-01-23 13:03:02.276387"}
The following is an example of the utterance callback JSON with requestid included:
{"source":"sample1.wav","utterance":{"confidence":0.89,"end":6.17,"recvtz":["EST",-18000],"text":"And that it was resolved in a very professional manner. Your employees a very good.","start":0.55,"donedate":"2020-01-23 13:04:47.875351","recvdate":"2020-01-23 13:04:47.274705","metadata":{"source":"sample1.wav","model":"devel:callcenter","uttid":0,"channel":0}},"requestid":"john,1234,567-uuid"}