Speech API

The Speech API triggers audio-file processing in Medallia Speech to create a feedback record in Experience Cloud.

Once audio files have been uploaded to Experience Cloud, use this API to trigger file processing in Medallia Speech, which will create a feedback record for each call.

Note: To use this API, customers must have already transferred the call recording files to an intermediary storage system between their network and the Experience Cloud instance. Medallia recommends that you use a Medallia Media File Transfer storage bucket for this purpose.

Additionally, there is a Medallia connector that uses this mechanism to fetch data from the Media File Transfer storage bucket for ingestion.

Restrictions and limits

The API supports bulk uploads and can handle up to 1,000 records per request. Additionally, there is a limit to the number of API calls an app can make within a given time period:

Up to 13,000 requests per minute
Up to 325,000 requests per 24-hour period

Important: The timeout for a request is 180 seconds. The API Gateway discards any request that takes longer than that. However, processing may continue beyond 180 seconds, so consider checking in reports whether the files have been processed before sending another request.
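The per-request record limit above can be respected by splitting a large list of call records into batches before sending. The following is a minimal sketch, not part of the Speech API itself; the helper name `chunk_records` and the constant names are hypothetical.

```python
from typing import Iterator, List

# Limits taken from the restrictions stated above.
MAX_RECORDS_PER_REQUEST = 1000
REQUEST_TIMEOUT_SECONDS = 180


def chunk_records(records: List[dict], size: int = MAX_RECORDS_PER_REQUEST) -> Iterator[List[dict]]:
    """Yield consecutive slices of `records`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


if __name__ == "__main__":
    # 2,500 placeholder records are split into three requests.
    calls = [{"call_identifier": f"call-{i}"} for i in range(2500)]
    print([len(batch) for batch in chunk_records(calls)])  # prints [1000, 1000, 500]
```

Each batch would then be sent as one request body, with the client-side timeout set no higher than `REQUEST_TIMEOUT_SECONDS`.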

Authentication and authorization

Authentication identifies who is making an API request, and authorization identifies what data the requester may access. OAuth is an industry standard for authorizing limited access to services and data. Applications must obtain a secure token that identifies the application that makes the request. The token is passed to the resource server (API server) with each API request. For more information, see Authenticate APIs with OAuth.

To use the Speech API:

The application must have an account.
The account's role must have permission to access the API.
API access is authenticated with OAuth. To use OAuth, the application must first obtain an OAuth access token, by requesting one for the application's client ID and secret. For detailed information, see Authenticating APIs with OAuth.
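As an illustration of the token step, the sketch below builds (but does not send) a token request from a client ID and secret. It assumes the OAuth 2.0 client-credentials grant with HTTP Basic client authentication, and the token endpoint URL is a placeholder; neither is confirmed by this page, so check Authenticating APIs with OAuth for the actual endpoint and grant type. The function name `build_token_request` is hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build an OAuth token request (assumed: client-credentials grant, Basic auth)."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        token_url,  # placeholder: the real token endpoint is not given on this page
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    request = build_token_request("https://example.invalid/oauth/token", "my-client", "my-secret")
    print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; the access token would then be read from the JSON response.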

Request/response formats

Requests to the Medallia Speech API use the HTTP POST method. Each request includes:

A required Content-Type header field that describes the format of the request body. The accepted type is application/json.

An optional Accept header field that tells the Speech API how to format the response. The accepted type is application/json.


Parameter	Description	Required	Values
Bearer	Access token	Required	See Authentication and authorization.
Content-Type	Format of request data	Required	application/json.
Accept	Format of response data	Optional	application/json.

No other header fields are expected.

The request URL always uses the base instance for the company's Medallia installation.

URL and endpoint

The Speech API accesses resources from a URL that follows this format:

https://<api-host>/<service>/<api-version>/<endpoint>

Where:

api-host is the server for your company's Medallia Experience Cloud instance. For detailed information about identifying the host, see API hosts.
service is speech.
api-version is v0.
endpoint is bulk-ingest.
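Putting the URL format and the header fields together, a request could be assembled as in the sketch below. The function name `build_bulk_ingest_request` is hypothetical, and sending the access token as `Authorization: Bearer <token>` is an assumption based on common OAuth practice; the header table above lists it only as "Bearer".

```python
import json
import urllib.request


def build_bulk_ingest_request(api_host: str, access_token: str, records: list) -> urllib.request.Request:
    """Build (but do not send) a POST request for the bulk-ingest endpoint."""
    # service = speech, api-version = v0, endpoint = bulk-ingest (from the list above)
    url = f"https://{api_host}/speech/v0/bulk-ingest"
    return urllib.request.Request(
        url,
        data=json.dumps(records).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",  # assumed header form
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_bulk_ingest_request("instance.apis.medallia.com", "example-token", [])
    print(request.full_url)  # prints https://instance.apis.medallia.com/speech/v0/bulk-ingest
```

To send it, pass the object to `urllib.request.urlopen(request, timeout=180)`, which matches the gateway timeout described under Restrictions and limits.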

POST body

This API is used to provide context for voice signals by sending metadata associated with the signals (audio files) in the body of the HTTP POST request.

Note: The following table lists the parameters that can be sent for each call record. Medallia recommends that you include these additional metadata values in your API request to ensure your audio is being transcribed more precisely to meet your business needs.

The request body should be encoded as a JSON array of objects, with each object containing keys and values that match the parameters shown in the following table:


Parameter	Description	Type	Required	Notes
call_identifier	A unique record identifier, encoded as a JSON string.	String	Required	It can be the Universal Call ID (UCID) or some other similar tracked value. Important: Make sure you can track this parameter, since it can later be used as an external ID for the file during import or export operations. For example: 12345-67890-1234567890.
speech_file_name	Name of the audio file associated with the record, encoded as a JSON string.	String	Required	S3 supports the use of forward slashes in file names to simulate folders. If your company uses this feature, you must include the full path in the Speech File Name. For example: audio/2020-07-03!1000/T15996_A.wav.
unit_identifier	ID of the agent that handled the call (typically the last agent the customer is transferred to, if there are multiple agents), encoded as a JSON string.	String	Optional	This must match the ID that is included in the organizational hierarchy for the agent. Note: You can supplement with additional unit fields through the transfer of custom metadata. For example, if your company is using Apps (formerly known as Best Practice Packages) that have a different unit field, you can include that field as metadata.
call_date_and_time	Date and time of the interaction, encoded as a JSON string of an ISO-8601 timestamp.	Datetime	Required	Format is `yyyy-MM-dd HH:mm:ssZZ` (e.g. 2016-01-01 11:30:00-0800).
engine	The speech-to-text transcription engine to use for the call, encoded as a JSON string.	String	Required	The default value is "Engine1". The accepted values are: Engine1 — Voci engine Engine2 — Amazon Transcribe engine Engine3 — Speechmatics engine
call_recording_url	URL to an external resource of the call interaction recording, encoded as a JSON string.	String	Optional	This is typically used to reference back to the source (third-party) system. Note: This URL is not used to download the call recording. It is intended as a clickable link from Medallia Reporting to the source system.
vertical_model	Medallia Speech vertical model to use for analyzing the call contents, encoded as a JSON string.	String	Optional	The default value is “Call Center”. The accepted value is "Call Center".
locale	Primary language spoken by the customer during the call, encoded as a JSON string of ISO 639-1 values.	String	Optional	The default value is “en-US”. The accepted ISO 639-1 values are: en-US, en-GB, en-AU, es-US, es-ES, es-MX, fr-CA, fr-FR, de-DE, it-IT, pt-BR, pl-PL (only available for Engine2), sv-SE (only available for Engine2), el-GR (only available for Engine2), ko-KR (only available for Engine2), zh-TW (only available for Engine2), en-IN (only available for Engine2), ar-AE (only available for Engine2), zh-CN (only available for Engine2), ja-JP (only available for Engine2). Note: When the `engine` is `Engine1`, if the field `agent_locale` has a value, then `locale` will be used for the customer channel and `agent_locale` for the agent channel. Otherwise, `locale` will be used as the media language for the entire file.
agent_locale	Primary language spoken by the agent during the call, encoded as a JSON string of ISO 639-1 values.	String	Optional	The default and accepted values are the same that can be sent for `locale`. Note: This parameter is available only when the `engine` is "Engine1".
apply_diarization	Boolean that determines whether diarization needs to be applied to the audio file during processing, encoded as a JSON string.	String	Optional	Diarization presumes two people are speaking, and separates mono audio recordings into distinct channels by categorizing speech into two groups. So, this setting only applies to mono-channel recordings, which need to get diarized. The default value is “No”. The accepted values are: Yes No
agent_channel	Determines which of the 2 channels (0 or 1) is associated with the agent. The other channel is associated with the customer. Note: The initiator of a call is assigned to channel 0. For inbound calls, set the agent channel to 1. For outbound calls, set the agent channel to 0.	String	Optional	Must be mapped during Auto-Importer processing. Confirm how your telephony system records data to audio channels to properly set this value. The default value is “0”. The allowed values are: 0 1
substitutions	The set of transcription substitutions to make during processing, encoded as a JSON object of key/value pairs. Substitutions can correct errors in transcripts using substitution rules that find and replace transcription errors with corrected values.	Substitution data object	Optional	The format is that of a JSON object, where the keys are the original versions to find and the values are the replacement versions. See the example below for proper formatting: {"appeal box":"a PO box","triple A batteries":"AAA batteries"} Important: Substitution rules are processed as part of the call made to the Speech API, and therefore cannot be applied to historical data. If you need to apply new substitution rules to data already transcribed by Speech, you must resend the associated audio file through the API.
apply_redaction	Boolean that determines whether redaction is performed on the audio and its transcription, encoded as a JSON string.	String	Optional	By default, if no value is set, redaction is set to “Yes”. Restriction: Redaction is not available for the Amazon Transcribe engine. The allowed values are: Yes No Note: For security purposes, Medallia Speech automatically redacts credit card numbers, Social Security Numbers, and street addresses from the transcription and playback audio. If your company wishes to keep that information visible in Experience Cloud, set `"apply_redaction": "No"` as part of the transcription API request.
first_name	First name of the customer, encoded as a JSON string.	String	Optional	Only required if a followup survey is being sent for the Contact Center interaction, since this would be necessary for the email invitation.
last_name	Last name of the customer, encoded as a JSON string.	String	Optional	Only required if a followup survey is being sent for the Contact Center interaction, since this would be necessary for the email invitation.
email	Email address of the customer, encoded as a JSON string.	String	Optional	Only required if a followup survey is being sent for the Contact Center interaction, since this would be necessary for the email invitation.
phone_number	Phone number of the customer, encoded as a JSON string.	String	Optional	This allows closed-loop feedback processes to have the customer phone number available when applicable. It can be based on the ANI (Automatic Number Identification).
connection_id	Unique identifier of the connection profile. For more information see Implement Speech.	String	Optional	This property is set automatically when you create a new connection profile.
connector_id	Unique identifier of a specific Speech Connector API as configured in Medallia Admin Suite.	String	Optional Important: When using the Speech API, if one or more Speech API type connectors are configured, this parameter is required. In this scenario, the API fails if the `connector_id` is not provided.	If set, the Medallia Speech data in the API request is routed through the connector for processing, including extra metadata available in `speech_additional_info`.
speech_additional_info	Additional information specific to each speech vendor. Use this parameter to send additional call audio metadata.	Information data object	Optional	See the example below for proper formatting: {"queue_name":"Bank","queue_id":"12","direction":"Inbound","skill":"Bank","agent_first_name":"Gordon","agent_last_name":"Gekko"} This option enables clients to augment the default set of call metadata. Restriction: Use of this field requires setting a `connector_id`.
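Some of the rules in the table above can be checked on the client before a request is sent. The sketch below covers only the required fields, the enumerated values, and the cross-field restrictions stated in the table; it is not the service's own validation, and the names `validate_record`, `REQUIRED_FIELDS`, `ALLOWED_VALUES`, and `ENGINE2_ONLY_LOCALES` are hypothetical.

```python
from typing import List

REQUIRED_FIELDS = ("call_identifier", "speech_file_name", "call_date_and_time", "engine")
ALLOWED_VALUES = {
    "engine": {"Engine1", "Engine2", "Engine3"},
    "apply_diarization": {"Yes", "No"},
    "apply_redaction": {"Yes", "No"},
    "agent_channel": {"0", "1"},
}
ENGINE2_ONLY_LOCALES = {"pl-PL", "sv-SE", "el-GR", "ko-KR", "zh-TW", "en-IN", "ar-AE", "zh-CN", "ja-JP"}


def validate_record(record: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    for field, allowed in ALLOWED_VALUES.items():
        if field in record and record[field] not in allowed:
            problems.append(f"invalid value for {field}: {record[field]!r}")
    engine = record.get("engine")
    if record.get("locale") in ENGINE2_ONLY_LOCALES and engine != "Engine2":
        problems.append(f"locale {record['locale']} is only available for Engine2")
    if "agent_locale" in record and engine != "Engine1":
        problems.append("agent_locale is only available for Engine1")
    if "speech_additional_info" in record and not record.get("connector_id"):
        problems.append("speech_additional_info requires connector_id")
    return problems


if __name__ == "__main__":
    record = {"call_identifier": "12345-67890-1234567890", "engine": "Engine1", "locale": "ja-JP"}
    for problem in validate_record(record):
        print(problem)
```

Returning a list of problems, rather than raising on the first one, lets a caller report every issue in a batch at once.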

Response

The API is synchronous; the response is a JSON object that includes:


Element		Description	Type	Notes
job_id		UUID of the transcription job.	String	Medallia recommends storing this value for troubleshooting and auditing purposes.
status		The overall status of the processing of the request. This value represents whether the basic requirements were met to accept the file for processing; it does not indicate that the transcription will succeed.	String	Values: ACCEPTED PARTIALLY_ACCEPTED REJECTED Note: A status of ACCEPTED or REJECTED means all the call entities provided in the request take on that status. A status of PARTIALLY_ACCEPTED means that there is a difference in status on particular call entities in the request, and the `details` array should be parsed for further status on each.
details		An array of details related to the call entities from the request.	File processing data object	This element is returned when one or more files could not be processed (when the status is PARTIALLY_ACCEPTED or, as the samples under Error handling show, REJECTED). See Error handling.
	call_identifier	Unique record identifier.	String	This value is used to associate the response in the details array with the Medallia Speech API response details.
	speech_file_name	Filename of the audio file on the Medallia Media File Transfer system that is associated with the record.	String	—
	status	Status of the record processing.	String	Values: ACCEPTED REJECTED
	error_message	Brief and human-readable description of the error that occurred.	String	—
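A client might read the response elements above as in the following sketch, which collects the rejected records from the `details` array. The function name `rejected_records` is hypothetical, and the example response is a shortened, made-up instance of the structure in the table.

```python
from typing import List, Tuple


def rejected_records(response: dict) -> List[Tuple[str, str]]:
    """Return (call_identifier, error_message) for each rejected entry in `details`."""
    return [
        (item.get("call_identifier", ""), item.get("error_message", ""))
        for item in response.get("details", [])
        if item.get("status") == "REJECTED"
    ]


if __name__ == "__main__":
    example = {
        "job_id": "00000000-0000-0000-0000-000000000000",
        "status": "PARTIALLY_ACCEPTED",
        "details": [
            {"call_identifier": "call-1", "speech_file_name": "a.wav", "status": "ACCEPTED"},
            {"call_identifier": "call-2", "speech_file_name": "b.wav", "status": "REJECTED",
             "error_message": "File b.wav does not exist"},
        ],
    }
    print(rejected_records(example))  # prints [('call-2', 'File b.wav does not exist')]
```

Because `details` can be absent, the helper returns an empty list in that case; the caller should still check the overall `status` value.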

Sample requests

The following samples show how to format the body of the request when using the Speech API to transfer call data.

Sample request - Speech API call with one record, all fields

POST https://instance.apis.medallia.com/speech/v0/bulk-ingest
Content-Type: application/json

[
   {
       "call_identifier": "0696e114-b819-11ea-b3de-0242ac130004",
       "speech_file_name": "T15584.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1", 
       "call_recording_url": "https://www.youtube.com/watch?v=6CcCyYSMwnQ",
       "vertical_model": "Call Center",
       "locale": "en-US",
       "agent_locale": "en-US",
       "apply_diarization": "No",
       "agent_channel": "0",
       "substitutions": {"appeal box":"a PO box","triple A batteries":"AAA batteries"},
       "apply_redaction": "Yes",
       "first_name": "Janelle",
       "last_name": "Perry",
       "email": "janelle.perry@mail.com",
       "phone_number": "555-555-5555",
       "connection_id": "79589f06-ad70-4621-bdf3-37ef1c693ff0",
       "connector_id": "68478g15-be61-3512-ceg2-26de2b782gg1",
       "speech_additional_info": {"queue_name":"Bank","queue_id":"12","direction":"Inbound","skill":"Bank","agent_first_name":"Gordon","agent_last_name":"Gekko"}
   }
]

Sample response - Accepted

{
   "job_id": "713c865e-0d1f-43a0-9998-5fada657850b",
   "status": "ACCEPTED"
}
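A record like the one in the single-record sample above could be assembled programmatically. The sketch below is one way to do it, with a hypothetical helper name `make_record`. It formats the timestamp in the T-separated ISO-8601 form used by the samples; the parameter table describes the format as `yyyy-MM-dd HH:mm:ssZZ`, so confirm which form your instance expects.

```python
from datetime import datetime, timedelta, timezone


def make_record(call_id: str, file_name: str, when: datetime, engine: str = "Engine1", **optional: object) -> dict:
    """Build one call record; optional parameters are passed as keyword arguments."""
    if when.tzinfo is None:
        raise ValueError("call time must be timezone-aware")
    record = {
        "call_identifier": call_id,
        "speech_file_name": file_name,
        "call_date_and_time": when.isoformat(timespec="seconds"),
        "engine": engine,
    }
    # Skip optional values that were explicitly passed as None.
    record.update({key: value for key, value in optional.items() if value is not None})
    return record


if __name__ == "__main__":
    call_time = datetime(2020, 6, 24, 13, 2, tzinfo=timezone(timedelta(hours=-3)))
    record = make_record("0696e114-b819-11ea-b3de-0242ac130004", "T15584.wav", call_time, locale="en-US")
    print(record["call_date_and_time"])  # prints 2020-06-24T13:02:00-03:00
```

For the space-separated form in the parameter table, `call_time.strftime("%Y-%m-%d %H:%M:%S%z")` yields `2020-06-24 13:02:00-0300`.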

Sample request - Speech API call with 6 records

POST https://instance.apis.medallia.com/speech/v0/bulk-ingest
Content-Type: application/json

[
   {
       "call_identifier": "8a98e8f7-f815-4247-90da-57ec53da6c50",
       "speech_file_name": "T15987A.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1",
       "call_recording_url": "https://www.youtube.com/watch?v=6CcCyYSMwnQ",
       "apply_redaction": "Yes",
       "substitutions": {"sub1":"subA"},
       "locale": "en-US",
       "agent_locale": "en-US"
   },
   {
       "call_identifier": "7f06bf84-98fc-4776-8617-033022819c9c",
       "speech_file_name": "T15987B.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1",
       "apply_redaction": "Yes",
       "first_name": "Janelle",
       "last_name": "Perry"
   },
   {
       "call_identifier": "0e04a5a9-bab2-4644-9aab-04e887213e50",
       "speech_file_name": "T15987C.wav",
       "unit_identifier": "svc_tech_103",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1",
       "email": "janelle.perry@mail.com",
       "phone_number": "555-555-5555"
   },
   {
       "call_identifier": "504b4661-30f5-46f3-b833-c008d5c9b8c6",
       "speech_file_name": "T15987D.wav",
       "unit_identifier": "svc_tech_103",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1",
       "locale": "en-US",
       "agent_channel": "0",
       "substitutions": null,
       "apply_redaction": "Yes"
   },
   {
       "call_identifier": "e7cd7a2b-fd51-4028-99db-4bc13aa85ffd",
       "speech_file_name": "T15987E.wav",
       "unit_identifier": "svc_tech_103",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1",
       "call_recording_url": "https://www.youtube.com/watch?v=6CcCyYSMwnQ",
       "vertical_model": "Call Center"
   },
   {
       "call_identifier": "75285c45-0675-4325-95a8-867a1575f074",
       "speech_file_name": "T15987F.wav",
       "unit_identifier": "svc_tech_103",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2",
       "call_recording_url": "https://www.youtube.com/watch?v=6CcCyYSMwnQ",
       "vertical_model": "Call Center",
       "apply_diarization": "No",
       "apply_redaction": "Yes",
       "email": "janelle.perry@mail.com"
   }      
]

Sample response - Accepted

{
   "job_id": "84c18904-b2a2-445b-88c2-efe8ccb0cab9",
   "status": "ACCEPTED"
}

Sample request - Speech API call with 5 records

POST https://instance.apis.medallia.com/speech/v0/bulk-ingest
Content-Type: application/json

[
   {
       "call_identifier": "cb80e932-9a86-45c5-af83-4a0441daca3b",
       "speech_file_name": "T15560A.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2"
   },
   {
       "call_identifier": "6ff2ce14-3be8-43ba-949b-f00254a7be3f",
       "speech_file_name": "T15560B.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2"
   },
   {
       "call_identifier": "e5265198-4a43-4e17-8f7e-60a0c8146401",
       "speech_file_name": "T15560C.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2"
   },
   {
       "call_identifier": "f2796561-9c7e-4825-bee3-769a954e1c66",
       "speech_file_name": "T15560D.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2"
   },
   {
       "call_identifier": "1a87c03c-d148-4b70-b745-eeca94b1cc0b",
       "speech_file_name": "T15560E.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1"
   }
]

Sample response - Partially accepted

{
   "job_id": "377533aa-d342-4a6a-8af3-724a91587bb1",
   "status": "PARTIALLY_ACCEPTED",
   "details": [
       {
           "call_identifier": "cb80e932-9a86-45c5-af83-4a0441daca3b",
           "speech_file_name": "T15560A.wav",
           "status": "ACCEPTED"
       },
       {
           "call_identifier": "6ff2ce14-3be8-43ba-949b-f00254a7be3f",
           "speech_file_name": "T15560B.wav",
           "status": "ACCEPTED"
       },
       {
           "call_identifier": "e5265198-4a43-4e17-8f7e-60a0c8146401",
           "speech_file_name": "T15560C.wav",
           "status": "REJECTED",
           "error_message": "File T15560C.wav does not exist"
       },
       {
           "call_identifier": "f2796561-9c7e-4825-bee3-769a954e1c66",
           "speech_file_name": "T15560D.wav",
           "status": "ACCEPTED"
       },
       {
           "call_identifier": "1a87c03c-d148-4b70-b745-eeca94b1cc0b",
           "speech_file_name": "T15560E.wav",
           "status": "REJECTED",
           "error_message": "File T15560E.wav does not exist"
       }
   ]
}

Error handling

There are several types of errors that can happen when calling the API:

1. Client problems such as rate limiting or missing authorization (4xx HTTP codes).
2. The body of the request fails internal validation (syntax, formatting, etc.).
3. The user-supplied parameters or context are bad (for example, the specified file cannot be found in the storage system).
4. One or more Speech API type connectors are configured, but you have not provided a connector_id.

For errors 1 and 2 above, the application gets an HTTP error, so the response won't be a Speech API response because the error happened before processing the request.

Sample request - Speech API call with one record - Invalid payload

POST https://instance.apis.medallia.com/speech/v0/bulk-ingest
Content-Type: application/json

[
   {
       "call_identifier": "0696d868-b819-11ea-b3de-0242ac130004",
       "speech_file_name": "file.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1"
  
]

Sample response - Status 400 - Bad request

{
   "error_type": "invalid_input",
   "message": "Input request body is not valid",
   "_allowed": []
}

For error 3, the response has an HTTP 200 code and contains a mix of file data and error messages. This is because the API tries to process the request and returns details when something goes wrong.

Sample request - Speech API call with one record - Missing file

POST https://instance.apis.medallia.com/speech/v0/bulk-ingest
Content-Type: application/json

[
   {
       "call_identifier": "ab8c2994-9359-4269-ba4f-cd4c0d53635e",
       "speech_file_name": "T15525.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine1"
   }
]

Sample response - Status 200 - OK

{
   "job_id": "68a49d2d-71ba-4dda-a27b-85d17039d2cc",
   "status": "REJECTED",
   "details": [
       {
           "call_identifier": "ab8c2994-9359-4269-ba4f-cd4c0d53635e",
           "speech_file_name": "T15525.wav",
           "status": "REJECTED",
           "error_message": "File T15525.wav does not exist"
       }
   ]
}

Note: The service does not validate if duplicate parameters are sent in the body of the request, or across several requests. If the same request is sent more than once, all will be accepted.
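Because the service accepts duplicates, a client that wants to avoid creating repeated records has to filter them itself. The sketch below keeps the first record seen for each `call_identifier` within one batch; the function name `dedupe_by_call_identifier` is hypothetical, and tracking identifiers across separate requests would need storage that is not shown here.

```python
from typing import List


def dedupe_by_call_identifier(records: List[dict]) -> List[dict]:
    """Keep the first record for each call_identifier, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record.get("call_identifier")
        if key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique


if __name__ == "__main__":
    batch = [
        {"call_identifier": "call-1", "speech_file_name": "a.wav"},
        {"call_identifier": "call-2", "speech_file_name": "b.wav"},
        {"call_identifier": "call-1", "speech_file_name": "a.wav"},
    ]
    print(len(dedupe_by_call_identifier(batch)))  # prints 2
```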

For error 4, the Speech API fails with the following if the connector_id is not provided:

{
   "job_id": "7b721583-cb4f-48bb-8861-426ed0cf8719",
   "status": "REJECTED",
   "details": [
       {
           "call_identifier": "12042024-454004002450966-record-cv20240413001",
           "speech_file_name": "Audio/454004002450966.wav",
           "status": "REJECTED",
           "error_message": "Send failed; nested exception is org.apache.kafka.common.errors.SaslAuthenticationException: {\"status\":\"invalid_token\"}"
       }
   ]
}
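The error cases in this section split into two groups: those reported through the HTTP status code, and those reported inside a normal Speech API response. The sketch below shows one way a client could tell them apart. The function name `classify_outcome` and its return labels are hypothetical, and treating every status other than 200 as an HTTP-level error is an assumption.

```python
import json


def classify_outcome(http_status: int, body_text: str) -> str:
    """Map an HTTP status and response body to a coarse outcome label."""
    if http_status != 200:
        # Errors such as rate limiting, authorization failures, or an invalid payload.
        return "http_error"
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return "unreadable_response"
    if not isinstance(body, dict):
        return "unreadable_response"
    labels = {
        "ACCEPTED": "accepted",
        "PARTIALLY_ACCEPTED": "partially_accepted",
        "REJECTED": "rejected",
    }
    return labels.get(body.get("status"), "unreadable_response")


if __name__ == "__main__":
    print(classify_outcome(400, '{"error_type": "invalid_input"}'))    # prints http_error
    print(classify_outcome(200, '{"job_id": "x", "status": "REJECTED"}'))  # prints rejected
```

For the `partially_accepted` and `rejected` outcomes, the `details` array carries the per-record `error_message` values.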

Speech API payload examples

The following samples show how to format the body of the request when using the Speech API in different contexts.


Speech API payload

Loaded via Auto Importer.

[
   {
       "call_identifier": "cb80e932-9a86-45c5-af83-4a0441daca3b",
       "speech_file_name": "T15560A.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2",
       "agent_channel": "1",
       "apply_redaction": "Yes",
       "apply_diarization": "No",
       "locale": "en-US",
       "substitutions": {"appeal box":"a PO box","triple A batteries":"AAA batteries"}
   }
]

Speech API payload + Connection ID profiles in Setup

Loaded via Auto Importer.

[
   {
       "call_identifier": "cb80e932-9a86-45c5-af83-4a0441daca3b",
       "speech_file_name": "T15560A.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2",
       "agent_channel": "1",
       "apply_redaction": "Yes",
       "apply_diarization": "No",
       "locale": "en-US",
       "substitutions": {"appeal box":"a PO box","triple A batteries":"AAA batteries"},
       "connector_id": "e8274f70-981c-11ed-938e-9f223223dd53"
   }
]

Speech API payload for connectors + metadata

Metadata added to payload via `speech_additional_info` mapped to fields through connector data mappings.

[
   {
       "call_identifier": "cb80e932-9a86-45c5-af83-4a0441daca3b",
       "speech_file_name": "T15560A.wav",
       "unit_identifier": "wm_advisor_1",
       "call_date_and_time": "2020-06-24T13:02:00-03:00",
       "engine": "Engine2",
       "agent_channel": "1",
       "apply_redaction": "Yes",
       "apply_diarization": "No",
       "locale": "en-US",
       "substitutions": {"appeal box":"a PO box","triple A batteries":"AAA batteries"},
       "connector_id": "e8274f70-981c-11ed-938e-9f223223dd53",
       "speech_additional_info": {
           "RECORDINGID": "62d53717-5a2e-d006-8e18-32e92cef00",
           "RECORDINGDATE": "2023-01-19T21:26:44Z",
           "LINENAME": "SMP_LEGACY_MN_1",
           "CALLID": "2003735217",
           "CALLTYPE": "External",
           "CALLDIRECTION": "Inbound",
           "STATIONID": "MN0136",
           "LOCALNAME": "Dacy Hanson",
           "ASSIGNEDWORKGROUP":
           "STARTTIME": "2023-01-06T21:26:44Z",
           "REMOTENAME": "GEORGE TWIGG",
           "ENDTIME": "2023-01-06T21:28:59.23Z",
           "CALLDURATIONSECONDS": "180",
           "DNIS": "2172389143",
           "SKILLSET": "CS_RES_NNE_FIBER",
           "DISCOTYPE": "Remote Disconnect"
       }
   }
]

Tip: While `engine`, `vertical model`, `locales`, `diarization`, `redactions`, `substitutions` and other values can be defined, there is currently no merge mechanism between the Speech API payload and connector settings, so Experience Cloud will use the values defined in the Speech API payload if present.