Background
An Item Crawler crawls the child nodes under a given parent node and determines which items are fit to be added to a Lucene index. Let's start by looking at the configuration file that defines all the indexes for the master database: Sitecore.ContentSearch.Lucene.Indexes.Sharded.Master.config
All the indexes defined in this file use the standard crawler, SitecoreItemCrawler.
This is the built-in crawler for general purposes. It includes methods like "IsExcludedFromIndex" to determine if an item should be added to the index or not.
Requirements
Now let's say you have a "/sitecore/content" node that has hundreds of child nodes. This is a likely scenario for single-instance, multi-site solutions for companies that have hundreds of sub-brands. If we index the "/sitecore/content" node, the standard crawler would index every node beneath it and generate a huge index containing many items that we don't care about. What if we only want to index items that are based on a specific template? We would have to build a custom crawler to do that.
Step-by-Step
Let's create a new class that inherits from the standard crawler. If we inspect the code for the standard crawler, everything is fine except for the "IsExcludedFromIndex" method. We can override this method to exclude items based on their templates.
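As a rough sketch of such an override (not the original code from this post): the class name, namespace, and template ID below are hypothetical placeholders, and the method signature is assumed to match the "IsExcludedFromIndex(SitecoreIndexableItem, bool)" overload of SitecoreItemCrawler, so check it against the Sitecore.ContentSearch assembly in your version. It compiles only against the Sitecore assemblies.

```csharp
using Sitecore.ContentSearch;
using Sitecore.Data;
using Sitecore.Data.Items;

namespace MySite.Search // hypothetical namespace
{
    // Custom crawler that only lets items of one template into the index.
    public class TemplateFilteredItemCrawler : SitecoreItemCrawler
    {
        // Placeholder value: replace with the ID of the template you want to index.
        private static readonly ID IncludedTemplateId =
            new ID("{11111111-1111-1111-1111-111111111111}");

        // Assumed signature; verify it against your Sitecore version.
        protected override bool IsExcludedFromIndex(
            SitecoreIndexableItem indexable, bool checkLocation = false)
        {
            // Keep the standard exclusion rules first.
            if (base.IsExcludedFromIndex(indexable, checkLocation))
            {
                return true;
            }

            Item item = indexable.Item;
            if (item == null)
            {
                return true;
            }

            // Exclude everything that is not created directly from the chosen template.
            // This compares the item's own template only, not inherited base templates.
            return item.TemplateID != IncludedTemplateId;
        }
    }
}
```

To use a crawler like this, the "type" attribute of the crawler element in the index configuration would point at the new class and its assembly instead of the standard crawler. Comparing "TemplateID" is the simplest check; if items based on derived templates should also be indexed, the comparison would need to walk the template's base templates as well.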