Journey into the Aggregation Process - Part 2

Friday, May 06, 2016 @ 02:30

By: Scott Gillis, Lead Consultant

After arriving in the reporting database, our journey continues, taking us to the end reports. The second leg of this journey is neither difficult nor long, with three quick stopovers before arriving in a chart or table.

Journey into the Aggregation Process Part Two - Image One

Stop 1: Dimension and Fact Tables

In past versions of Sitecore (anything before 7.5), data collection and reporting were run out of the same analytics database. Because this system was transactional first, the database was reasonably normalized, which made finding and summarizing data straightforward for a general developer.

The Sitecore Experience Platform has separated the transactional data from the reporting data. The first leg of this journey covered the creation of a denormalized database built specifically with reporting in mind.

The new reporting database is built on dimension and fact tables. For the non-DBAs riding along, this means the aggregation has processed our data into a common data warehouse model, allowing more and faster reporting to occur.

Sitecore has also provided twelve pre-built views that combine and summarize data for use in reporting. Looking at this list, you can see that many of them answer the common initial questions asked of the data when first starting out.
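For the non-DBAs, the dimension/fact idea can be sketched with Python's built-in sqlite3 module. The table, column, and view names here (dim_site, fact_visits, visits_by_site) are hypothetical and are not the actual Sitecore reporting schema. The point is only the shape: a narrow fact table of measures keyed to small dimension tables, plus a view that pre-joins and summarizes them.

```python
import sqlite3

# In-memory database; every name below is hypothetical, not Sitecore's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive attributes, one row per site.
    CREATE TABLE dim_site (site_id INTEGER PRIMARY KEY, site_name TEXT);

    -- Fact table: numeric measures keyed by date and dimension id.
    CREATE TABLE fact_visits (
        visit_date       TEXT,
        site_id          INTEGER REFERENCES dim_site(site_id),
        visits           INTEGER,
        engagement_value INTEGER
    );

    -- View: joins and summarizes so report queries stay simple.
    CREATE VIEW visits_by_site AS
        SELECT d.site_name,
               SUM(f.visits)           AS visits,
               SUM(f.engagement_value) AS engagement_value
        FROM fact_visits AS f
        JOIN dim_site AS d ON d.site_id = f.site_id
        GROUP BY d.site_name;
""")

conn.executemany("INSERT INTO dim_site VALUES (?, ?)",
                 [(1, "website"), (2, "microsite")])
conn.executemany("INSERT INTO fact_visits VALUES (?, ?, ?, ?)",
                 [("2016-04-02", 1, 10, 50),
                  ("2016-04-03", 1, 5, 20),
                  ("2016-04-02", 2, 3, 0)])

# A report only needs to read the view, not know about the joins behind it.
summary = conn.execute(
    "SELECT site_name, visits, engagement_value "
    "FROM visits_by_site ORDER BY site_name"
).fetchall()
print(summary)
```

A report author reads from the view and never has to repeat the join, which is the convenience the pre-built views offer.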

Journey into the Aggregation Process Part Two - Image Two

Stop 2: Search Indexes

During the aggregation process, much of the data that is collected about a contact record (both known and anonymous) is placed into a search index. Depending on your installation, this might be Solr or Lucene.

I've included indexes in this leg of the journey, as they are used by Sitecore in a couple of the reporting screens. It is a smart place to begin building if you plan to report or create an application on top of the Experience Profile.

The indexed information is used to populate the list of recent contacts that is displayed when the Experience Profile is opened. The Search bar and general data shown are all pulled from the index.

Segmentation rules used in the Email Experience Manager and List Manager perform the checks against the search index as well.

Stop 3: Reporting Service

The final stop on the way into Experience Analytics (and other reports) is the Reporting Service API. Sitecore has abstracted the retrieval of data from the reporting database into a JSON-based REST service. This does not mean we cannot create custom reports that pull data through other mechanisms (such as directly from SQL), but it does provide a nice standard on which to base new reports.

With a little digging, I found that the URL for accessing the reporting API is:
https://<my site>/sitecore/api/ao/aggregates/{site}/{segments}/{keys}
This route is mapped in Sitecore.ExperienceAnalytics.Api.Http.Configuration.RouteMapper, and found in the assembly Sitecore.ExperienceAnalytics.dll. The controller that manages requests is Sitecore.ExperienceAnalytics.Api.Http.AnalyticsDataController, also found in the Sitecore.ExperienceAnalytics.dll assembly.

With this knowledge, we can break down the basics of what happens during a call made by the "Online Interactions by visits and value per visit" line chart on the opening dashboard screen.

Journey into the Aggregation Process Part Two - Image Three

[The next post in the series will look at the tables and charts that exist by default and how to create our own. Right now, the focus will be on the specifics of how the data is captured.]

The data is retrieved via an HTTP GET to the following URL (breakdown follows):

URL Segment | Description
--- | ---
https://<my site> | The site
/sitecore/api/ao/aggregates | Mapped API route
/all | Corresponds to the mapped route's 'site' parameter. The site parameter controls whether the data returned covers all sites being tracked (the default) or a single site defined in the web.config's sites node, such as "website".
/786FBA3A4573445EA74504E3CA5E48C1 | Corresponds to the mapped route's 'segments' parameter. This is the ID of a segment item defined in the master database at /sitecore/system/Marketing Control Panel/Experience Analytics/Dimensions/<Specific Dimension>/<Segment Item>. Segments provide the cross section (filter) of which visitors are included in the counts for the data shown.
/all | Corresponds to the mapped route's 'keys' parameter. Currently, this can be one of two values: all or sum, depending on the requirements of the query.
?&dateGrouping=by-auto&&dateFrom=04-02-2016&dateTo=04-03-2016&keyGrouping=collapsed | Beyond the fields the controller requires, there is a set list of filter options that can be passed as query string parameters.
These options are defined/limited by:
Available query strings are:
  • dateFrom
  • dateTo
  • dateGrouping
  • keyGrouping
  • keyOrderBy
  • keyTop
  • keySkip
  • keysFromParent
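The breakdown above can be sketched as a small URL builder. This is a minimal illustration, not part of Sitecore: build_report_url and the example.test host are hypothetical, while the route, segment ID, and query string names are the ones listed in this post.

```python
from urllib.parse import urlencode

API_ROUTE = "/sitecore/api/ao/aggregates"

# Query string names listed in the post; anything else is rejected.
ALLOWED_FILTERS = {
    "dateFrom", "dateTo", "dateGrouping", "keyGrouping",
    "keyOrderBy", "keyTop", "keySkip", "keysFromParent",
}


def build_report_url(host, segment_id, site="all", keys="all", **filters):
    """Assemble a Reporting Service request URL (hypothetical helper)."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    # Route parameters, in order: {site}/{segments}/{keys}
    path = f"{API_ROUTE}/{site}/{segment_id}/{keys}"
    query = urlencode(filters)
    return f"{host.rstrip('/')}{path}" + (f"?{query}" if query else "")


url = build_report_url(
    "https://example.test",  # placeholder host, not a real site
    "786FBA3A4573445EA74504E3CA5E48C1",
    dateGrouping="by-auto",
    dateFrom="04-02-2016",
    dateTo="04-03-2016",
    keyGrouping="collapsed",
)
print(url)
```

Rejecting unknown filter names up front keeps typos such as "datefrom" from silently producing an unfiltered report.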

Sitecore.ExperienceAnalytics.Api.ReportDateService::ExecuteQuery() is eventually called with the parsed URL, including the query string parameters. Within this method, a number of helper methods are used to build a SQL query that is executed against the reporting database. The helper methods can be found in the Sitecore.ExperienceAnalytics.Api.Query.QueryBuilder namespace.
The basic process looks like this:

Journey into the Aggregation Process Part Two - Image Four
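To give a feel for the "parsed request in, SQL out" step, here is a rough sketch of turning a segment ID and date range into a parameterized query. It is not Sitecore's query builder code: build_query, the fact_segment_metrics table, and its columns are all hypothetical, and SQLite stands in for the reporting database.

```python
import sqlite3
from datetime import date


def build_query(segment_id, date_from, date_to):
    """Return (sql, params) for a per-day summary of one segment.

    Hypothetical table and column names. Values travel as bound
    parameters so request input is never spliced into the SQL text.
    """
    sql = (
        "SELECT visit_date, SUM(visits), SUM(engagement_value) "
        "FROM fact_segment_metrics "
        "WHERE segment_id = ? AND visit_date BETWEEN ? AND ? "
        "GROUP BY visit_date ORDER BY visit_date"
    )
    params = [segment_id, date_from.isoformat(), date_to.isoformat()]
    return sql, params


# Stand-in reporting database with a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_segment_metrics ("
    "segment_id TEXT, visit_date TEXT, visits INTEGER, engagement_value INTEGER)"
)
segment = "786FBA3A4573445EA74504E3CA5E48C1"
conn.executemany(
    "INSERT INTO fact_segment_metrics VALUES (?, ?, ?, ?)",
    [
        (segment, "2016-04-01", 7, 10),   # outside the date range
        (segment, "2016-04-02", 4, 12),
        (segment, "2016-04-02", 6, 8),
        (segment, "2016-04-03", 2, 0),
        ("OTHER", "2016-04-02", 99, 99),  # different segment
    ],
)

sql, params = build_query(segment, date(2016, 4, 2), date(2016, 4, 3))
rows = conn.execute(sql, params).fetchall()
print(rows)
```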

If you have added custom dashboards to Experience Analytics, Sitecore provides JavaScript that takes the returned JSON and places it into the chart or table you've selected. If you are spinning up your own reporting page, you'll need some of your own custom JS or server-side code to parse and display the data.
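For the roll-your-own case, the server-side half might look something like the sketch below. The row field names (date, visits, value) are invented for illustration and are not the documented response format, so inspect an actual response before relying on them. A real call would likely also need an authenticated Sitecore session, which is left out here.

```python
import json
from urllib.request import urlopen


def fetch_report(url):
    """GET the report URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def value_per_visit(rows):
    """Map each row's date to value per visit, guarding against zero visits."""
    return {
        row["date"]: (row["value"] / row["visits"] if row["visits"] else 0.0)
        for row in rows
    }


# Invented sample standing in for decoded response rows; not real Sitecore output.
sample_rows = [
    {"date": "2016-04-02", "visits": 10, "value": 50},
    {"date": "2016-04-03", "visits": 0, "value": 0},
]
print(value_per_visit(sample_rows))
```

Keeping the fetch and the calculation in separate functions means the calculation can be tested against sample rows without a running Sitecore instance.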

Because the querying of data is abstracted, the handling of report query calls can be placed on a server with the 'Reporting Services' role. This role can be installed alongside others, such as the Processing or CM roles, but if your plans involve a lot of custom reporting queries, it is highly advisable to move this role to its own server for the best data return rates.

Arriving into the Station

And with our heads pondering the possibilities of custom reporting applications, we roll into the station of Experience Analytics and the other data dashboards Sitecore has built for us. Understanding where and how the collected data is managed provides the power to begin asking specific questions around how to allow visitors to get the most engagement out of each visit.

Reporting Service Bibliography

  1. Provides a glossary of different server roles and services:
  2. Sitecore diagram showing a basic reporting architecture:
  3. Explains the files to enable/disable when configuring the Reporting Services role:

As always, feel free to tweet me questions or comments @thecodeattic or on Sitecore Slack Community as @gillissm.



Scott Gillis, Lead Consultant at Paragon and 2017 Sitecore MVP, has been working with Sitecore for several years. He has a deep passion for helping clients leverage their content and data into powerful new capabilities in Sitecore and has produced successful outcomes as the technical lead on numerous, complex implementations. Recently, Scott has been focusing on helping these clients take advantage of the wealth of data collected by Sitecore Experience Analytics.