Sitecore and Solr: searching empty fields

Posted 29 June 2017 12:00 AM by Vicent Galiana, Sitecore Solutions Architect @ ClearPeople

By default, when we index our content in Solr (the same applies to Lucene), Sitecore creates one document per item and stores the value of each Sitecore field in its related Solr field. This is just the basic scenario, and it can (and will) become more complex as we introduce other scenarios, but it is enough for our purposes.

In Solr and Lucene, each document is different. A document can have multiple fields, the same field can be stored several times (a multivalued field), and a field may not exist at all for a given document.

By default, if our Sitecore field is empty, Sitecore won't create the field in Solr, making it difficult (if possible at all) to search for things like "where myfield is empty". The usual best practice for handling this scenario is to store an "empty" value in our Solr field, to be able to look for it.
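
To make the difference concrete, here is a rough sketch of the two Solr queries involved. The field name sectors_sm and the placeholder value none are assumptions for illustration only; the real field name depends on how your index is configured.

```text
# Field missing from the document: match everything, then exclude documents that have any value in the field
q=*:* -sectors_sm:[* TO *]

# Explicit "empty" value stored in the field: a plain term query is enough
q=sectors_sm:none
```

The first form works in plain Solr, but it is awkward to express through a higher-level query API, which is why storing an explicit placeholder value is usually preferred.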

Thanks to this post, I learned that Sitecore has now implemented this best practice, so we can specify the value to be inserted in the index for null values or empty strings. That is great, but...

The implementation of the field reader for multivalue fields prevents this mechanism from working. This field reader always returns a list, even an empty one, so the document builder never uses any of the empty/null values assigned in the configuration.

With a few lines of code, I replaced the default reader with my own implementation, which checks whether the list is empty before returning it to the document builder. If the list is empty, I return an empty string instead, "forcing" the document builder to use the emptyString value configured for each field.

This is my field reader:

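using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;

namespace ClearPeople.Search.Crawler.FieldReader
{
    // Derives from the stock multilist reader so that base.GetFieldValue returns the list of values.
    public class MultiListFieldReader : global::Sitecore.ContentSearch.FieldReaders.MultiListFieldReader
    {
        public override object GetFieldValue(IIndexableDataField indexableField)
        {
            var values = base.GetFieldValue(indexableField) as List<string>;

            // An empty string makes the document builder fall back to the configured emptyString value.
            if (values == null || !values.Any())
                return string.Empty;

            return values;
        }
    }
}

And this is how we can replace the default reader for multivalue fields:
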
<fieldReaders type="Sitecore.ContentSearch.FieldReaders.FieldReaderMap, Sitecore.ContentSearch">
  <mapFieldByTypeName hint="raw:AddFieldReaderByFieldTypeName">
    <fieldReader fieldTypeName="checklist|multilist|treelist|treelistex|tree list"
                 fieldNameFormat="{0}"
                 fieldReaderType="ClearPeople.Search.Crawler.FieldReader.MultiListFieldReader, ClearPeople.Search.Crawler"
                 patch:instead="fieldReader[@fieldTypeName='checklist|multilist|treelist|treelistex|tree list']" />
  </mapFieldByTypeName>
</fieldReaders>

And finally, as per the previous blog posts, this is how we can configure the value for empty fields. Please note that even when we only need the emptyString attribute, we also need to specify nullValue, so that Sitecore calls the right overload of the configuration method:

<fieldMap>
  <fieldNames hint="raw:AddFieldByFieldName">
    <fieldType fieldName="excludefromsearch" returnType="string" />
    <field returnType="stringCollection" fieldName="sectors" nullValue="none" emptyString="none" />
  </fieldNames>
</fieldMap>
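
Once the index has been rebuilt with this configuration, searching for items with no sectors should come down to searching for the placeholder value. The following is only a minimal sketch of such a query through the ContentSearch LINQ API, not a tested implementation: the SectorSearchItem class, the EmptySectorSearch helper and the index name passed in are hypothetical names introduced for illustration, and it assumes the "sectors" field is configured as shown above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

// Hypothetical result type exposing the "sectors" index field.
public class SectorSearchItem : SearchResultItem
{
    [IndexField("sectors")]
    public IEnumerable<string> Sectors { get; set; }
}

// Hypothetical helper, shown for illustration only.
public static class EmptySectorSearch
{
    // Must match the nullValue/emptyString configured in the field map.
    private const string EmptyValue = "none";

    public static List<SectorSearchItem> FindItemsWithoutSectors(string indexName)
    {
        using (var context = ContentSearchManager.GetIndex(indexName).CreateSearchContext())
        {
            // Items with an empty multilist were indexed with the placeholder,
            // so an ordinary "contains" filter finds them.
            return context.GetQueryable<SectorSearchItem>()
                .Where(item => item.Sectors.Contains(EmptyValue))
                .ToList();
        }
    }
}
```

Keeping the placeholder in a single constant makes it harder for the query and the configuration to drift apart.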

I have raised a ticket with Sitecore to verify this behaviour and confirm a valid solution. In the meantime, use this with care and always run your own tests before going live with it.
