Boost newer documents in Sitecore 7 & Solr 4

Thursday, September 18, 2014 @ 12:02

Originally posted at http://www.sitecoreblogger.com/

By: Matt Gartman, Senior Developer/Systems Engineer – Sitecore 7 search backed by Apache Solr makes for a very powerful and flexible search solution.  Unfortunately, Sitecore's search abstraction layer was built to behave consistently regardless of the underlying search technology.  As a result, some features that are unique to Solr are not exposed through the Sitecore 7 ContentSearch API.

If you are looking to boost your search results by publication date (or any date field, for that matter), the general recommendation from a Solr perspective is to use a function query, as outlined in the SolrRelevancyFAQ.  Unfortunately, Sitecore's API does not offer any out-of-the-box way to generate a function query.  Digging into SolrNet (which Sitecore's Solr implementation makes use of), you can use LocalParams to attach a function query to your search request.  Since this is specific to SolrNet and has no equivalent in Lucene, it has not made it into Sitecore's ContentSearch API.

Extending Sitecore's ContentSearch to support this type of feature proved challenging and required a good deal of the ContentSearch code to be rewritten.  Knowing that such a customization would be difficult to maintain between Sitecore versions, I opted to look for a different approach.

Because Sitecore offers limited options for manipulating what is actually sent over to Solr, I looked for something that could live inside the query itself and discovered the _val_ hook that Solr provides as part of FunctionQuery.  It allows us to embed a function query inside the actual search query.  We can expose it as a field in Sitecore, so no customization of the ContentSearch code is required.
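To make the shape of the resulting query concrete, here is a minimal, language-neutral sketch in Python of the kind of query string this approach aims for. The field names `_content` and `publication_date_tdt` and both helper functions are assumptions made for illustration; the exact text that Sitecore emits may differ in escaping and grouping.

```python
def date_boost_clause(date_field: str) -> str:
    """Return a _val_ clause wrapping the date-boost function query."""
    function_query = f"recip(ms(NOW,{date_field}),3.16e-11,1,1)"
    # The function is quoted so the query parser treats it as a single value.
    return f'_val_:"{function_query}"'


def build_query(main_clause: str, date_field: str) -> str:
    """Combine an ordinary search clause with the embedded function query."""
    return f"({main_clause}) AND {date_boost_clause(date_field)}"


# Hypothetical field names, used only for illustration.
query = build_query("_content:a", "publication_date_tdt")
print(query)
```

Because _val_ looks like an ordinary field:value pair, it can ride along in a query produced by an API that only knows how to compare fields to values.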

Creating the Search Model

You will need to create a search model that uses Sitecore.ContentSearch.SearchTypes.SearchResultItem as its base class.  Using the base class is not strictly necessary, but it gives you all of the standard Sitecore fields already defined.

Code:

Update Solr Schema

You will need to manually define "_val_" in your schema.xml so that Sitecore does not attempt to treat it as a dynamic field.  To do this, add the _val_ field to the <fields> node in your Solr core's schema.xml.  Without this, Solr will pick it up as a string field and append the _s suffix.  Remember that you will need to either reload the Solr core or restart Solr so it picks up the new schema.xml.

schema.xml
<fields>
<field name="_val_" type="string" />
...snip...
</fields>

Date Boost Search Query

Now that your model is set up, you just need to define the DateBoost predicate.  This will be added to your GetQueryable LINQ statement.  I am using Sitecore's PredicateBuilder to make the code more readable.

Code:

The date boost function query uses the function recommended in the SolrRelevancyFAQ, recip(ms(NOW, {FieldName}), 3.16e-11, 1, 1), where {FieldName} is replaced with your Sitecore field name.  Note that you need to use the full dynamic field name that Solr knows about, which likely ends in _tdt if you are accessing one of Sitecore's standard Date/Time fields.  The recip function can be adjusted depending on how aggressive you want the date boosting to be.  How much boosting you need will depend on which other fields you are searching on, what other boosting is applied to the fields or documents, and how much impact your requirements say the document date should have on the score.
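To get a feel for the numbers, the following Python sketch evaluates the same formula outside of Solr. Solr's recip(x, m, a, b) computes a / (m * x + b), and ms(NOW, field) is the document's age in milliseconds. The helper names and the sample ages are assumptions chosen for illustration.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def recip(x: float, m: float, a: float, b: float) -> float:
    """Solr's recip(x, m, a, b) function: a / (m * x + b)."""
    return a / (m * x + b)


def date_boost(age_days: float, m: float = 3.16e-11) -> float:
    """Boost for a document whose date field lies age_days in the past."""
    return recip(age_days * MS_PER_DAY, m, 1.0, 1.0)


# Print the boost for a few sample ages.
for age in (0, 30, 365, 730, 3650):
    print(f"{age:>5} days old -> boost {date_boost(age):.3f}")
```

With these constants a document dated today gets a boost of 1.0, a one-year-old document roughly 0.5, and a two-year-old document roughly 0.33, because 3.16e-11 is approximately one divided by the number of milliseconds in a year.  A larger m makes the boost fall off faster, and a smaller m flattens the curve.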

Verification

You will now be able to execute that search and use Sitecore's search logs to view the query that is being sent over to Solr.  You will see that the _val_ hook is added as expected, and documents that contain the letter "A" and have more recent Publication Dates (a custom field defined on our template) should score higher than older documents.  This simple search example works well for our boosting because our scores are normalized around 1.  The more complex your search query, the more you may have to adjust the recip() function to get the desired results.
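The following Python sketch illustrates why the size of the text scores matters. It uses a deliberately simplified model in which the final score is the text score plus the date boost; real Solr scoring also applies factors such as query normalization, so treat the numbers as an intuition aid only. The function names and sample scores are assumptions.

```python
MS_PER_YEAR = 365 * 24 * 60 * 60 * 1000


def recip(x: float, m: float, a: float, b: float) -> float:
    """Solr's recip(x, m, a, b) function: a / (m * x + b)."""
    return a / (m * x + b)


def combined_score(text_score: float, age_years: float, a: float = 1.0) -> float:
    """Simplified model (an assumption): text score plus the date boost."""
    return text_score + recip(age_years * MS_PER_YEAR, 3.16e-11, a, 1.0)


def relative_gain(text_score: float, a: float = 1.0) -> float:
    """How far a brand-new document outranks a two-year-old one, all else equal."""
    new = combined_score(text_score, 0, a)
    old = combined_score(text_score, 2, a)
    return (new - old) / old


print(f"text score ~1,  a=1:  {relative_gain(1.0):.1%}")
print(f"text score ~10, a=1:  {relative_gain(10.0):.1%}")
print(f"text score ~10, a=10: {relative_gain(10.0, a=10.0):.1%}")
```

Under this model the date boost dominates when text scores sit around 1 but barely registers when they sit around 10.  Scaling the third argument of recip() by the same factor as the text scores restores the original balance.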

Some additional information regarding Sitecore and search boosting can be found at John West's blog: Sitecore-7-Six-Types-of-Search-Boosting