dbi Blog

In the previous post, I walked through the query side of the dashboard JSON: objectType, class, filters, date tokens. What I deliberately left out was the aggregation block, because it deserves its own post.

The aggregation is what transforms the set of matching objects into a value the widget can render. There are five aggregation types, six reducers, and one optional second dimension (seriesProperty). Together, they cover every widget shape from a single KPI number to a cross-tab pivot table.

1. The two-dimensional model

The aggregation has two complementary components:

The aggregation type is the shape of the result: one value (summary), one value per group (groupByProperty), one value per time bucket (groupByDateBucket), one value per admin-defined range (groupByRange), or a simple list of objects (list).
The aggregation reducer is the operation applied within each group (or over the whole matching set for summary): count (default), sum, avg, min, max, median. Note: The reducer is ignored for list.

In other words, type decides whether you get a number, a chart, or a table; reducer decides what that number, bar, or cell actually measures.

2. The five aggregation types

2.1. summary – one value over everything

summary reduces the entire matching set into a single value. By default it counts the matching objects. However, with a different reducer, you can make it compute a property aggregate in different ways. Here are a few example of summary aggregations:

// Total count of matching objects
"aggregation": { "type": "summary" }

// Total revenue of matching objects
// (sum all values from the "Amount" property)
"aggregation": {
  "type": "summary",
  "reducer": "sum",
  "reducerProperty": "Amount"
}

// Latest contract expiry date
// (display only the date with the highest value - drill-through still list all objects)
"aggregation": {
  "type": "summary",
  "reducer": "max",
  "reducerProperty": "Effective through"
}

The widgets that accept summary are kpiNumber, gauge, and table. If you followed this series, you probably saw a bunch of the first two, already. The last one, in this case, will render the value as a one-row table. For all other chart widgets (donut, bar, line, area), summary does not really make visual sense, as a single value cannot be plotted on an X-Y axis.

2.2. groupByProperty – one value per group

groupByProperty groups objects by the distinct values of a property (the propertyName) and applies the reducer to each group independently. Here are two examples:

// Count of matching objects per Agreement type
"aggregation": {
  "type": "groupByProperty",
  "propertyName": "Agreement type",
  "includeEmptyResults": "No"
}

// Total revenue of matching objects per Customer
// (sum all values from the "Amount" property for each Customer independently)
"aggregation": {
  "type": "groupByProperty",
  "propertyName": "Customer",
  "reducer": "sum",
  "reducerProperty": "Amount"
}

The supported fields are the following ones:

Field	Required	Description
*propertyName*	Yes	Property to group by (any type is supported)
*reducer*	No	Defaults to *count* (c.f. section 1 above)
*reducerProperty*	Yes, when reducer != count	Property to reduce within each group. Can be the same as *propertyName* but it can also be different (c.f. above)
*includeEmptyResults*	No	When set to *“Yes”, it adds a (none)* group which will contain objects without value (e.g. no value for “Customer” property)
*seriesProperty*	No	Splits the chart into multiple series (c.f. section 4 below)

The widgets that accept groupByProperty are donut, bar, line, area and table. Basically, all charts widgets, plus the table which is kind of a Swiss knife, that works with everything.

2.3. groupByDateBucket – one value per time period

It is very similar to the groupByProperty, but the target must be a date / timestamp. Basically, something that contains a date, whether its only the date or a full date-time is fine. Because dates would probably be a bit too wide, there is a concept of buckets, to group dates by a pre-defined range that might make sense. The default time periods are day, week, month, quarter or year. Here are two examples:

// Count of contracts expiring per month
"aggregation": {
  "type": "groupByDateBucket",
  "propertyName": "Effective through",
  "bucketSize": "month",
  "includeEmptyResults": "Yes"
}

// Sum of invoice amounts per quarter
"aggregation": {
  "type": "groupByDateBucket",
  "propertyName": "Invoice date",
  "bucketSize": "quarter",
  "reducer": "sum",
  "reducerProperty": "Amount"
}

The supported fields are pretty similar to the groupByProperty:

Field	Required	Description
*propertyName*	Yes	Property to group by (date / timestamp only)
*bucketSize*	No	Defaults to *month*, the size of the range to group by
*reducer*	No	Same as *groupByProperty*
*reducerProperty*	Yes, when reducer != count	Same as *groupByProperty*
*includeEmptyResults*	No	Same as *groupByProperty, but “Yes”* will also fill gaps with zero values (e.g. a month without revenue is still displayed as *“0”*, it’s not silently ignored)
*seriesProperty*	No	Same as *groupByProperty*

Note: you would usually combine groupByDateBucket with a date filter to limit the range of results. Without filters, the chart would show everything from the earliest object in the vault to the most recent, possibly spanning decades. If that’s what you want to see, then that’s absolutely fine. However, most of the time, a filter like between @startOfYear and @endOfYear for an annual trend chart might be more appropriate.

2.4. groupByRange – one value per admin-defined range

This one is also very similar to the last two groupings. With the main difference that this is the only one that allows you to define the exact range.

2.4.1 The groupByRange itself

If you have a certain property that has a high cardinality, it might be difficult to display it with a groupByProperty. Let’s take for example the “Amount” property from before. When you want to apply math on it (sum/avg/min/max/median), then that’s fine because it only returns one value. But if you want to see the revenue themselves, without a prior grouping on something else (e.g. above we first group by “Customer”), then you would end-up with dozens/hundreds/thousands of groups? That’s where groupByRange shines, because you define the grouping (e.g. 0-1’000, 1’000-5’000, >=5’000).

You can use that grouping method with numeric / time / text / lookup properties. On the other hand, date / timestamp and boolean aren’t supported. The reason for that is simple: there is already groupByDateBucket for date / timestamp, and boolean can only have 2 values (Yes / No), so groupByProperty works just fine.

// Invoices by specific range
// 3 ranges: 0-1000, 1000-5000, 5000-10000
"aggregation": {
  "type": "groupByRange",
  "propertyName": "Amount",
  "boundaries": ["0", "1000", "5000", "10000"]
}

// Invoices by specific range, with open ranges
// 4 ranges: 0-1000, 1000-5000, 5000-10000, >=10000
"aggregation": {
  "type": "groupByRange",
  "propertyName": "Amount",
  "boundaries": ["*", "1000", "5000", "10000", "*"]
}

// Efficiency / speed of processing / duration of some actions / etc...
// 4 ranges: 0-30s, 30s-1min, 1min-2min, >=2min
"aggregation": {
  "type": "groupByRange",
  "propertyName": "Duration",
  "boundaries": ["*", "00:00:30", "00:01:00", "00:02:00", "*"]
}

// Customers by country name
// 3 ranges: A-F, F-M, >=M
// e.g. "France" in 2nd group, Switzerland in 3rd group
"aggregation": {
  "type": "groupByRange",
  "propertyName": "Country",
  "boundaries": ["A", "F", "M", "*"],
  "reducer": "count"
}

Again, the supported fields are fairly similar to the groupByProperty:

Field	Required	Description
*propertyName*	Yes	Property to group by (numeric / time / text / lookup only)
*boundaries*	Yes	Ordered array of range boundary values (at least 2 values, sorted ascending, with at least 1 non “*” value)
*reducer*	No	Same as *groupByProperty*
*reducerProperty*	Yes, when reducer != count	Same as *groupByProperty*
*includeEmptyResults*	No	Same as *groupByProperty, but “Yes”* will also show empty ranges
*seriesProperty*	No	Same as *groupByProperty*

2.4.2. How boundaries work

Each boundary value marks the inclusive lower bound of a range and the exclusive upper bound of the range below it, except for the last range, which is fully inclusive. For example, ["0", "1000", "5000", "10000"] creates three buckets: [0, 1000), [1000, 5000) and [5000, 10000]. This means that a value of “1000” will end-up on the 2nd bucket only. A value of “5000” or “10000” will end-up on the 3rd bucket.

In addition, as you can see in the examples above, you can use a wildcard ("*") on either end: a leading "*" creates a bucket for everything below the first boundary, and a trailing "*" creates a bucket for everything at or above the last boundary.

2.5. list – one row per object

list are pretty similar to search results from M-Files, in the sense that it will just list something, with pre-defined columns. When using this aggregation type, the reducers have no effect, because it only lists objects but do not apply any modifications / computing on them.

// List all contracts with 3 specific columns
"aggregation": {
  "type": "list",
  "displayProperties": ["Agreement type", "Effective through", "Responsible person"]
}

Field	Required	Description
*displayProperties*	No, but highly recommended	Name of properties to include in the table, as columns, in addition to the object name

For simple lists, the objects will be pre-sorted alphabetically but the user is then able to re-sort them, by each of the columns displayed. When drillThroughEnabled is set to “Yes”, then table rows become clickable and allows navigation to the object in question. There is no modal/drill-through in this case, since the table list already display the target object (no grouping).

3. The six reducers

As mentioned, all five aggregation types, except list, accept a reducer. You probably understand them already, but just as a quick table:

Reducer	What it computes	reducerProperty types	Empty set returns
*count* (default)	Number of matching objects	n/a	0
*sum*	Total of the property’s values	Numeric or time	*null* (rendered as –)
*avg*	Arithmetic mean	Numeric or time	*null* (rendered as –)
*median*	Middle value	Numeric or time	*null* (rendered as –)
*min*	Smallest value	Numeric or date / timestamp / time	*null* (rendered as –)
*max*	Largest value	Numeric or date / timestamp / time	*null* (rendered as –)

3.1. The date/time-valued reducer rule

When min or max is applied to a date, timestamp, or time property, the reducer returns the value and it will be formatted for end-users based on their localization / regional settings: e.g. DD/MM/YYYY for dates, DD/MM/YYYY HH:mm for timestamps (minute precision), and HH:mm:ss for times.

When sum, avg or median is applied to a time property, it allows you to compute a total (or average/median) duration for a certain activity. This might be useful if you have time-constraints.

Three widgets can render date/time values: kpiNumber (a single big number), gauge (switch to date mode, covered in Post 4a) and table (simple display in rows).

4. seriesProperty – the second dimension

The series details were already covered a bit in the posts 4b and 4c. But, this is the feature that turns a single-series widget into a multi-actor comparison. When you set a seriesProperty on a groupBy aggregation, the engine will automatically produce one series per distinct value of the series property. This allows a two-dimensional comparison.

The behavior per widget type is the following:

line / area – one colored line per series, with an auto-generated legend.
bar – one colored sub-bar per series (display.barLayout controls stacked vs grouped).
donut – a multi-mini-pie grid, one donut per series, with a shared legend.
table – a cross-tab pivot, one column per series.

As a reminder, kpiNumber and gauge ignore seriesProperty, since they only display single-values.

4.1. Use low-cardinality series only

I repeated this in every widget post but it is worth restating: seriesProperty should be low-cardinality. A property with five distinct values produces a readable multi-series chart. A property with two hundred values produces a colored mess. In that case, you might want to use a groupByRange, to reduce the amount of groups and therefore the series.

4.2. includeEmptyResults in multi-series

In single-series mode, includeEmptyResults: “Yes” fills empty time buckets with zero values. In multi-series mode, it does the same across both dimensions: every series gets a zero in any bucket where it has no data. This avoids broken lines and visually confusing gaps.

The behavior also adds a (none) bucket for objects whose primary group property has no value, and a (none) series for objects whose series property has no value. These are appended at the end so it doesn’t disrupt the “main story”.

4.3. Multi-select lookups in seriesProperty (and propertyName)

A subtle but important case: as you probably know, M-Files has a multi-select lookup property type. These allow the selection of multiple pre-defined values. Because of that, objects with multiple values will end-up in multiple buckets/groups, with the “counted once per value” rule.

Example: a customer with office locations in Geneva, Zurich and Berlin could appear three times, once for a Geneva bucket, once for a Zurich bucket and finally once for a Berlin bucket. The total of all bucket counts can therefore exceed the total number of objects when some objects have multiple values.

This is the correct and expected behavior for multi-select lookups. The alternative (counting each object only in its first value) would silently hide the multi-value relationships that often matter most.

5. displayProperties on drill-through

I mentioned displayProperties in section 2.5 above, as optional column to be added for list aggregations. The same field has a second role for all other aggregation type (summary, groupByProperty, groupByDateBucket, groupByRange): it controls the columns in the drill-through modal.

"aggregation": {
  "type": "groupByProperty",
  "propertyName": "Agreement type",
  "displayProperties": ["Effective through", "Responsible person"]
}

When the user clicks a donut slice (or a bar, or a row in a count table), the drill-through modal shows one row per object in that group, with the object name plus all the optional columns defined in the displayProperties value (e.g. Effective through and Responsible person).

6. The compatibility cheat sheet

I already put this table at the end of Post 4c, but it is short enough, so:

Widget type	summary	groupByProperty	groupByDateBucket	groupByRange	list	seriesProperty
*kpiNumber*	yes	–	–	–	–	ignored
*gauge*	yes	–	–	–	–	ignored
*donut*	–	yes	yes	yes	–	multi-mini-pie
*bar*	–	yes	yes	yes	–	stacked or grouped
*line*	–	yes	yes	yes	–	multi-series lines
*area*	–	yes	yes	yes	–	multi-series areas
*table*	yes	yes	yes	yes	yes	cross-tab pivot

Combine this with the reducer table in section 3 and you have the full answer to “can I use aggregation X with widget Y, and with reducer Z on property type T”. In any case, the Visual Designer and the validation process will prevent you to make any mistake.

7. What this gives you

The query side (Post 5) plus the aggregation side (this post) together cover everything the engine supports. If that wasn’t the case before, you should now be able to read and understand every line from any of the previous JSON example.

The combinations are richer than they look at first. A groupByDateBucket on monthly invoices with a sum reducer on Amount and a seriesProperty on Customer, rendered as a line widget with includeEmptyResults: “Yes”, gives a multi-customer revenue trend that takes about 15 lines of JSON. A groupByProperty on Agreement type with a seriesProperty on Workflow state rendered as a table gives a cross-tab pivot with drill-through on every cell.

The possibilities aren’t endless, obviously, but good luck if you would like to try them all… Last time I checked, you could create several million different widget combinations. An important part of that would trigger warnings or errors for non-supported cases, but still a considerable scale.

So far, I covered the end-user part, the JSON, the widgets and now the queries. These are the building blocks of dashboards. The remaining posts of the series will cover:

Post 7 – The Admin tab: the Visual Designer and the JSON editor, how they relate, the two-stage validator, import / export.
Post 8 – Access control, performance, and the interactive features (auto-refresh, drill-through, exports, favorites, shareable links).
Posts 9a and 9b – The same Contracts dashboard built two ways: visually first, then in JSON.

Want to know more about this Business Dashboard? Contact us and we will be happy to showcase it on M-Files.

Post Views: 30

M-Files BD – Aggregations, reducers and series

1. The two-dimensional model

2. The five aggregation types

2.1. summary – one value over everything

2.2. groupByProperty – one value per group

2.3. groupByDateBucket – one value per time period

2.4. groupByRange – one value per admin-defined range

2.4.1 The groupByRange itself

2.4.2. How boundaries work

2.5. list – one row per object

3. The six reducers

3.1. The date/time-valued reducer rule

4. seriesProperty – the second dimension

4.1. Use low-cardinality series only

4.2. includeEmptyResults in multi-series

4.3. Multi-select lookups in seriesProperty (and propertyName)

5. displayProperties on drill-through

6. The compatibility cheat sheet

7. What this gives you

Morgan Patou

Leave a Reply:

Related blog articles

1. The two-dimensional model

2. The five aggregation types

2.1. summary – one value over everything

2.2. groupByProperty – one value per group

2.3. groupByDateBucket – one value per time period

2.4. groupByRange – one value per admin-defined range

2.4.1 The groupByRange itself

2.4.2. How boundaries work

2.5. list – one row per object

3. The six reducers

3.1. The date/time-valued reducer rule

4. seriesProperty – the second dimension

4.1. Use low-cardinality series only

4.2. includeEmptyResults in multi-series

4.3. Multi-select lookups in seriesProperty (and propertyName)

5. displayProperties on drill-through

6. The compatibility cheat sheet

7. What this gives you

Morgan Patou

Leave a Reply:

Related blog articles

M-Files BD – Queries: objectType, class, filters, date tokens

A project that looked successful. But…

M-Files BD – Distribution and tabular widgets: donut, bar, table

M-Files BD – Trend widgets: line and area