Aggregations
Query API supports different types of aggregations:
-
aggregate
— Single-number aggregation (such as overall NPS). It returns a single value. -
aggregateList
— Returns a list of single number aggregations on records. -
aggregateRank
— Calculates the rank of one dimension/metric relative to all others. -
aggregateRankList
— Returns a list of aggregation ranks given by either multiple metrics applied to the same dataset, or multiple datasets with its own metric. These scenarios are mutually exclusive. -
aggregateTable
— Returns a table where each cell includes the aggregation for the given row and column. -
aggregateTableList
— Returns a list of aggregation tables given by rows, columns and metrics applied to a given dataset.
In general, aggregations have a definition which determines the operation, and another part that outlines the data extraction or data retrieval operation. The definitions change based on the type of aggregation needed.
Sample query
query{
survey_count:aggregate(
definition:{
data:{ # defines where the data is taken from
source: FEEDBACK # FEEDBACK or INVITATIONS
filter:{ # filters applied to apply to the aggregation response
fieldIds:"e_responsedate", lt:"2019-02-01", gt:"2018-07-01"
}
}
metric:{ # definition of the metric applied on the filtered data
count:{ # calculation
field:{ # field the calculation will be done on
id:"a_surveyid"
}
}
}
})
}
Sample response
"data":{
"Survey_count": 345
}
List of aggregations
An aggregateList
operation is useful to request several aggregate metrics with a single dataset definition.
The following example brings back an average
and a count
calculation in a single operation.
Sample query
query {
aggregateList(
definition: {
data: {
source: FEEDBACK
}
metrics: [
{ average: { field: { id: "q_bp_cc_ltr_scale" } } }
{ count: null }
]
}
)
}
Sample response
{
"data": {
"aggregateList": [
5.745756020529017,
112894
]
},
"errors": null,
"_links": null,
"_allowed": [
"POST",
"GET"
]
}
Ranked aggregations
An aggregateRank
operation is useful where the account has a limited scope (such as a call center agent being granted access to only her specific unit) and wants to rank that, relative to some larger dataset.
The following example is made using an account with access to a single unit, looking to rank that unit across all units available in a data view with ID "27".
Sample query
query {
aggregateRank(definition: {
fieldKey: "e_unitid",
metric: { average: { field: { id: "e_ltr" } } },
minimumSampleSize: 15,
myData: { source: FEEDBACK },
rankedData: { dataView: { id: "27" } }
}) {
rank
totalElements
metricValue
}
}
Sample response
{
"data": {
"aggregateRank": {
"rank": 263,
"totalElements": 831,
"metricValue": 8.658536585365853
}
}
}
Ranked aggregations like this are useful when trying to contextualize where your limited data view fits relative to the larger context's data view. If your account has full access to the data anyway, you may need to apply myData
filters to achieve the same result.
List of aggregation ranks
An aggregateRankList
operation is useful to request several rank metrics given by either multiple metrics applied to the same dataset, or multiple datasets with its own metric.
The following example brings back an average
and a count
calculation applied to the same dataset in a single operation.
Sample query
query {
aggregateRankList(definition: {
fieldKey: "e_unitid",
metrics: [
{ average: { field: { id: "q_bp_cc_ltr_scale" } } }
{ count: null }
],
minimumSampleSize: 15,
myData: { source: FEEDBACK },
rankedData: { dataView: { id: "28" } }
}) {
rank
totalElements
metricValue
}
}
Sample response
{
"data": {
"aggregateRankList": [
{
"rank": 134,
"totalElements": 200,
"metricValue": 5.615384615384615
},
{
"rank": 256,
"totalElements": 594,
"metricValue": 73
}
]
},
"errors": null,
"_links": null,
"_allowed": [
"POST",
"GET"
]
}
Getting an aggregate table
aggregateTable
is a generic representation to return any aggregation which could be shaped as a table. The definition has different attributes to use depending on the type of aggregation. Not all the attributes are needed, and there are different attributes to do the same thing. This provides flexibility to model the aggregation in different ways.
Along with this generic definition, the combinations of attributes will be validated in run-time, and appropriate ones are used when the query executes.
Sample query
Get a table of the count of invitations, and the average likelihood to recommend per gender, split by quarters.
query {
aggregateTable(definition:{
data:{
source: INVITATIONS
filter:{ # filters to narrow down the data set to apply the aggregation.
fieldIds:"e_responsedate", lt:"2019-02-01", gt:"2018-07-01"
}
}
columns:{ #defines how the columns are generated
segment:{field:{ id: "e_personalgender"}}
# Using segments, the columns are created using the options of the field.
# "Slices" can be used here to manually define the columns but in run-time only.
# one option is available. Either "slices" or "segments" should be used.
}
rows:{
slices:{items:[ # Manual row definition is done with slices. As in column
# definition. "Segment" could also be used but only one of
# those should be used at a time.
{key:"0" # index of the slice.
label:"Q1" # label for the slice.
filter:{fieldIds:"e_responsedate", gt:"2018-01-01", lte:"2018-03-31"} #data filter that defines the grouping criteria
},
{key:"1"
label:"Q2"
filter:{fieldIds:"e_responsedate", gt:"2018-04-01", lte:"2018-06-30"}
},
{key:"2"
label:"Q3"
filter:{fieldIds:"e_responsedate", gt:"2018-07-01", lte:"2018-09-30"}
},
{key:"3"
label:"Q4"
filter:{fieldIds:"e_responsedate", gt:"2018-10-01", lte:"2018-12-31"}
}
]}
}
metrics:[ # multiple metrics can be defined
{count:{field:{id:"a_surveyid"}}},
{average:{field:{id:"q_ltr_score"}}}
]
}
){
#Here starts the data retrieval part
columns{ #returns key and labels for columns. Using this query example, this returns e_personalgender options.
key
label
}
columnKeys
columnLabels
rows{ #returns key and labels for rows. Using this query example, this returns the rows defined manually as slices.
key
label
}
rowLabels #same as rows.label
rowKeys #same as rows.keys
metricsTable # returns a 3D array. first index is the metric defined in order (zero-based array). On each array position there will be the matrix with the aggregation.
metricTable(index:1) #only returns the matrix with the aggregations for "average".
}
}
Getting a list of aggregate tables
An aggregateTableList
operation is useful to request several aggregations which could be shaped as a table, given by either multiple definitions applied to the same dataset, or multiple datasets with its own definition.
In the example below, the first definition
brings a filtered set of surveys completed between January 1, 2018 and July 1, 2018 for each value of e_bp_cc_gender_alt
(the gender of the person who receives the survey) and uses the count metric applied to field q_bp_cc_ltr_scale
(likelihood to recommend). The second definition counts the number of invitation surveys there are in the same time period.
Sample query
query {
aggregateTableList(
definitions: [
{
data: { source: FEEDBACK filter: { fieldIds: "e_responsedate" gte: "2018-01-01" lt: "2018-07-01" } }
rows: {
segment: {field: {id: "e_bp_cc_gender_alt"}}
}
metrics: {
customCalculation: {
name: "bp_count"
field: { id: "q_bp_cc_ltr_scale" }
}
}
}
{
data: { source: INVITATIONS filter: { fieldIds: "e_creationdate" gte: "2018-01-01" lt: "2018-07-01" } }
metrics: {
count: null
}
}]
) {
metricsTable
}
}
Sample response
{
"data": {
"aggregateTableList": [
{
"metricsTable": [
[
[
696
]
],
[
[
675
]
]
]
},
{
"metricsTable": [
[
[
1090
]
]
]
}
]
},
"errors": null,
"_links": null,
"_allowed": [
"POST",
"GET"
]
}
Using R-Fields in Query API
R-Fields are cross-records aggregations that have defined calculations. Query API takes that and aggregates data using the calculation defined in the module.
To query R-Fields use the custom
metric (available to R-Fields only).
The metric may be used in any metric object defined for all types of aggregations.
Sample query - Simple aggregate operation
query aggregateNps {
aggregate(definition: {
data: {
source: FEEDBACK,
filter:{
fieldIds: "e_responsedate",
gte: "2017-08-01"
}
}
metric: {
custom: {
field: { id: "r_bp_branch_net_promoter_score" }
}
}
})
}
Sample response
{
"data": {
"aggregate": 42.671335667833915
},
"errors": null,
"_links": null,
"_allowed": [
"POST",
"GET"
]
}
Sample query - aggregateTable operation
query aggregateNpsTable{
aggregateTable(
definition:{
data: {
source: FEEDBACK
},
columnGroups: [
{ slices: { items: [
{ key:"MAY", label:"May", filter: { fieldIds: "e_responsedate", gte: "2018-05-25" lt: "2018-06-01" } },
{ key:"JUNE" , label:"June", filter: { fieldIds: "e_responsedate", gte: "2018-06-01" lt: "2018-07-01" } }
] } }
]
metrics: [
{ custom: {field: {id: "r_bp_branch_net_promoter_score"} } }
]
}) {
columns { key label }
totalColumns
metricTable(index:0)
countsTable
}
}
Sample response
{
"data": {
"aggregateTable": {
"columns": [
{
"key": "MAY",
"label": "May"
},
{
"key": "JUNE",
"label": "June"
}
],
"totalColumns": 2,
"metricTable": [
[
35,
14.379084967320262
]
],
"countsTable": [
[
198,
2823
]
]
}
},
"errors": null,
"_links": null,
"_allowed": [
"POST",
"GET"
]
}
Available metrics for aggregations
As mentioned above, aggregations have a definition which determines the calculation. Below are the calculations, or metrics, available for aggregation nodes:
-
average — Average of the values of a field.
-
sum — Sum of the values of a field.
-
count — Count of the total number of records containing values for a field, if the latter is not provided, it is a plain count of the total number of records.
-
customCalculation — Applies a custom calculation defined in the setup (which is referenced by its label) and applies it to values provided by a given field.
-
custom — Applies an r-field defined in the setup to the dataset.
-
countUnique — Count of distinct values for a given field.
-
bucketCount — Returns the total number of records matching a given field filter condition. Value hiding compares the minimum sample size with the total count of surveys that have any value for that field.
-
percentage — Calculates the cell record percentage relative to one axis subtotal. For example: if the table has
Genre
on rows andAge
on columns, and you need to calculate the row percentage of a cell with(Male-Young)
it does the division:-
numerator — The survey's count with
Male
andYoung
. -
denominator — The survey's count with male genre and non-null age.
Note: This metric is only available foraggregateTable
.
-
-
topicPercentage — Percentage 0-100 of surveys matching the corresponding TA topic / Number of surveys which do not specifically match it. Notice this metric is meant to be used in combination with filtering/splits by topic. In the context of phrase-level matching mode, if one of the co-ocurring topics — acting as the context topic — is defined in the query global filters, then this metric acts as the co-ocurrence between that topic and each topic from the rows/columns splits, so
topicPercentage(contextTopic, splitTopic)
is:topicPercentage = Number of surveys with contextTopic + splitTopic co-ocurring / Surveys mentioning splitTopic
-
globalRecordsTopicPercentage — An equivalent metric to topicPercentage, which is different when one of the data filters is wrapped by topicFilter ignoring those filters when computing the number of surveys from the denominator. It is intended to be used in phrase level filtering when a context topic is wrapped by topicFilter and it calculates
globalRecordsTopicPercentage(contextTopic, splitTopic)
as:globalRecordsTopicPercentage = Number of surveys with contextTopic + splitTopic co-ocurring / Number of surveys which do not specifically match them
-
topicImpact — A measurement of the impact of a given topic presence on a metric.
The formula for impact is:
topicMetricCount * ( topicMetric - metric ) / ( topicMetricCount - metricCount )
The impact formula which has 4 parameters:
-
Score on surveys matching the corresponding TA topic.
-
Score samples count on surveys matching the corresponding TA topic.
-
Score on surveys which do not specifically match the topic.
-
Score samples count on surveys which do not specifically match the topic.
When both numerator and denominator are zero the impact is “0” by definition.
Notice this metric is meant to be used in combination with filtering/splits by topic.
-
-
scaledTopicImpact — Same as the topicImpact metric but multiplied by 100.
sentimentCount — Number of surveys with positive/negative sentiment. If filtering/splitting by topic is present, only comments matching it are counted.
sentimentPercentage — Percentage of surveys with positive/negative sentiment / Number of surveys If filtering/splitting by topic is present, only comments matching it will be counted.
sentimentNetPercentage — Net percentage calculated as:
sentimentPercentage: [POSITIVE, STRONGLY_POSITIVE] - sentimentPercentage: [NEGATIVE, STRONGLY_NEGATIVE]
regressionBeta — Beta Calculation of a set of independent fields and a single dependent field.
regressionCorrelation — Regression correlation of a set of independent fields and a single dependent field.
segmentAverage — Average for the values of a given metric on a given segment.
segmentPercentile — Percentile for the values of a given metric on a given segment.
filtered — Wrapper that allows to filter data before applying a metric.
sumOfProduct — Sum of products of two fields. This is useful for computing regressions.