Tailored indexing for Umbraco Search

This week I had the privilege of being part of the latest Umbraco DevRel Deep Dive 🥳
It ended up wonderfully geekish - right up my alley.
As part of the talk with Lotte and Seb, I showed off a fair few code snippets for tweaking and tailoring content indexing in Umbraco Search. I figured it’d be appropriate to share those snippets here, alongside a little more context on the why and when.
Let’s get our geek on and explore some of the extension points in Umbraco Search 🤓
Changing the default property indexing
Some property editors store numbers, some store texts. Some texts are meant for full text searching, some are meant for keyword filtering.
Umbraco Search aims to understand this kind of intent for each property editor type. To this end, it employs a concept called property value handlers. These are essentially property value converters for Search.
If you have custom property editors, you can help Search understand the intent of the stored data by creating your own property value handlers.
What’s more, if the core property value handlers don’t fulfill your search (indexing) requirements, they can easily be replaced.
For example: Search assumes that the “fixed value editors” from core (checkbox list, dropdown and radio button list) should be used for filtering, so the picked value(s) are indexed as keywords. This means that the picked value(s) won’t be available for free text search.
Now, say that I wanted radio button list properties to be both filterable (as keywords) and searchable. I could achieve that by implementing my own version of IPropertyValueHandler - like this:
using Umbraco.Cms.Core;
using Umbraco.Cms.Core.Models;
using Umbraco.Cms.Search.Core.Models.Indexing;
using Umbraco.Cms.Search.Core.PropertyValueHandlers;

namespace Site.Demos;

public class MyRadioButtonListPropertyValueHandler : IPropertyValueHandler
{
    // this property value handler can handle the radio button list from core
    public bool CanHandle(string propertyEditorAlias)
        => propertyEditorAlias is Constants.PropertyEditors.Aliases.RadioButtonList;

    public IEnumerable<IndexField> GetIndexFields(
        IProperty property,
        string? culture,
        string? segment,
        bool published,
        IContentBase contentContext)
        => property.GetValue(culture, segment, published) is string { Length: > 0 } value
            ? [
                new IndexField(
                    property.Alias,
                    new IndexValue
                    {
                        // index the value as both keyword for filtering
                        // and as text for full text searching
                        Keywords = [value],
                        Texts = [value]
                    },
                    culture,
                    segment)
            ]
            : [];
}
Indexing additional fields
When indexing content, Search gathers two sets of data:
- The system fields required to make Search tick, and
- The fields for all contained properties, using the property value handler concept described above.
This is usually just fine. But sometimes you need more index data to power search.
Perhaps you have domain specific and/or contextual data, which isn’t represented in the content model. Or maybe you need to perform up-front calculations to make specific search queries more performant or even possible.
To help you do all this, Search features content indexers. These are invoked at content level whenever a piece of content is indexed.
Let’s say you have modelled products as content in Umbraco, and that products are programmatically mapped to their respective categories - that is, the product categories are not part of the content model.
By implementing IContentIndexer, you can index the category mapping for each product for subsequent querying:
using Umbraco.Cms.Core.Models;
using Umbraco.Cms.Search.Core.Models.Indexing;
using Umbraco.Cms.Search.Core.Services.ContentIndexing;

namespace Site.Demos;

public class MyProductContentIndexer : IContentIndexer
{
    public async Task<IEnumerable<IndexField>> GetIndexFieldsAsync(
        IContentBase content,
        string?[] cultures,
        bool published,
        CancellationToken cancellationToken)
    {
        if (content.ContentType.Alias is not "product")
        {
            return [];
        }

        var categories = await GetCategories(content.Key);
        return categories.Length > 0
            ? [
                new IndexField(
                    "category",
                    new IndexValue
                    {
                        // index the product categories for filtering (faceting)
                        Keywords = categories
                    },
                    Culture: null,
                    Segment: null
                )
            ]
            : [];
    }

    private async Task<string[]> GetCategories(Guid productId)
        => await Task.FromResult<string[]>(["implement", "your", "own"]);
}
Unlike property value handlers, content indexers must be registered explicitly:
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Search.Core.Services.ContentIndexing;

namespace Site.Demos;

public class MySiteComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
        => builder.Services.AddTransient<IContentIndexer, MyProductContentIndexer>();
}
If everything else fails…
…there’s a notification for that.
All jokes aside… Search fires the IndexingNotification just before indexing content. You should consider this a last resort when all other options have been exhausted… but it does have its merits in a pinch.
Manipulating property index data
You can alter all data going into the index with this notification.
For example, if your content model contains a property that must not be added to the published (public) content index, you can remove it - like this:
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Search.Core.Notifications;
using SearchConstants = Umbraco.Cms.Search.Core.Constants;

namespace Site.Demos;

public class MyIndexingNotificationHandler : INotificationHandler<IndexingNotification>
{
    public void Handle(IndexingNotification notification)
    {
        // only proceed if this is a notification for the published content index
        if (notification.IndexInfo.IndexAlias is not SearchConstants.IndexAliases.PublishedContent)
        {
            return;
        }

        // find the field that should be omitted from the index
        var fieldToOmit = notification
            .Fields
            .FirstOrDefault(field => field.FieldName == "secretPropertyAlias");
        if (fieldToOmit is null)
        {
            return;
        }

        // remove the field from the fields collection
        notification.Fields = notification
            .Fields
            .Except([fieldToOmit])
            .ToArray();
    }
}
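A handler like this only runs once it has been registered. Here is a minimal sketch of that registration, assuming the standard AddNotificationHandler extension on IUmbracoBuilder works for Search notifications the same way it does for core notifications. The composer name MyIndexingNotificationComposer is made up for illustration:

```csharp
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Search.Core.Notifications;

namespace Site.Demos;

// hypothetical composer name - any IComposer in the site will do
public class MyIndexingNotificationComposer : IComposer
{
    public void Compose(IUmbracoBuilder builder)
        // assumption: IndexingNotification is published like any other Umbraco notification,
        // so the regular notification handler registration applies
        => builder.AddNotificationHandler<IndexingNotification, MyIndexingNotificationHandler>();
}
```

If the site already has a composer, the registration can just as well live there instead of in a separate class.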
Discarding the entire content
The IndexingNotification is a cancelable notification. This allows for cancelling the indexing of specific content altogether:
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Search.Core.Notifications;
using SearchConstants = Umbraco.Cms.Search.Core.Constants;

namespace Site.Demos;

public class MyIndexingNotificationHandler : INotificationHandler<IndexingNotification>
{
    // the ID of the super secret content type that should never be in the index
    private static readonly Guid SecretContentTypeId = Guid.Parse("6B730BA8-4560-4745-906C-FE08A2FF756C");

    public void Handle(IndexingNotification notification)
    {
        // grab the content type ID from the fields collection
        // - it is indexed as a keyword value in the core "ContentTypeId" field
        var contentTypeIdAsKeyword = notification
            .Fields
            .FirstOrDefault(field => field.FieldName is SearchConstants.FieldNames.ContentTypeId)?
            .Value
            .Keywords?
            .FirstOrDefault()
            ?? string.Empty;

        // is this the super secret content type?
        if (Guid.TryParse(contentTypeIdAsKeyword, out var contentTypeId)
            && contentTypeId.Equals(SecretContentTypeId))
        {
            // yes - cancel the notification
            notification.Cancel = true;
        }
    }
}
That’s all, folks!
Yep. Those were the extension points I covered in my chat with Lotte and Seb. Potent stuff, if put to proper use 💪
One thing is for sure: Umbraco Search is not meant as a one-size-fits-all solution - nor should it be. Requirements will always differ from project to project, and Search definitely needs all the extension points it can get.
I hope this sparked your interest and fueled your imagination ✨
Happy searching!