What is Lucene?

Lucene is a high-performance, full-featured, open-source text search engine (for more information, see www.lucene.net). It is the engine behind many commercially available search solutions.

Originally developed in Java, Lucene has been ported to Microsoft's .NET platform through a project called Lucene.NET.

Lucene's features include:

  • Powerful, accurate and efficient search algorithms
  • Ranked searching – best results returned first (otherwise known as ranking by relevance)
  • Many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • Fielded searching (e.g. code, description, brand)
  • Spelling correction
  • Relevance "boosting", allowing relevance rank to be raised based on external factors (e.g. product popularity from web site history, product brand)
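To make fielded, ranked searching with boosts more concrete, here is a minimal sketch in plain Python. It does not use the Lucene or Lucene.NET API, and its scoring (boosted term counts) is a deliberate simplification of Lucene's real relevance formula. The field weights, product records and function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical field weights: a match in "code" counts more than one in "description".
FIELD_BOOSTS = {"code": 3.0, "brand": 2.0, "description": 1.0}

# Hypothetical catalogue records.
PRODUCTS = [
    {"code": "LAM100", "brand": "Acme", "description": "A4 laminator with pouches"},
    {"code": "PCH200", "brand": "Acme", "description": "Laminator pouches A4 pack of 100"},
    {"code": "ERS300", "brand": "Zenith", "description": "Pencil eraser"},
]


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def score(product, query):
    """Sum the boosted number of query-term occurrences across all fields."""
    terms = tokenize(query)
    total = 0.0
    for field, boost in FIELD_BOOSTS.items():
        counts = Counter(tokenize(product[field]))
        total += boost * sum(counts[term] for term in terms)
    return total


def search(query, products=PRODUCTS):
    """Return (code, score) pairs for matching products, best score first."""
    scored = [(score(product, query), product) for product in products]
    hits = [(s, p) for s, p in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [(p["code"], s) for s, p in hits]


print(search("lam100 pouches"))
```

In this sketch the query "lam100 pouches" ranks LAM100 first because the match on the heavily weighted code field outweighs a description-only match.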

How is Lucene Used in Customer Self Service?

Many search solutions based on Lucene run externally to the source web site. In other words, web site administrators need to provide a data feed to the external service so that catalogue data can be indexed. Integration must then be developed to allow switching between the main web site and the third-party search web site.

With Customer Self Service, Lucene.NET has been implemented directly into the application codebase.

This provides the following advantages:

  • Fast indexing and an up-to-date search index due to the source data being local to the search engine.
  • Search queries are executed locally on the web site server, providing faster and more reliable searching
  • No third-party SaaS or license fees
  • Ability to apply additional business logic, such as filtering search results based on the current user's stock security (i.e. applying item restrictions)
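The last advantage, applying business logic to search results, can be pictured with a small Python sketch. How Customer Self Service actually enforces item restrictions is not described here, so the function name, the hit structure and the set of restricted codes below are hypothetical.

```python
def filter_restricted(hits, restricted_codes):
    """Drop hits the current user may not see, preserving the ranked order."""
    return [hit for hit in hits if hit["code"] not in restricted_codes]


# Hypothetical ranked hits, best first, as a search might return them.
ranked_hits = [
    {"code": "LAM100", "score": 4.0},
    {"code": "PCH200", "score": 1.0},
]

# Hypothetical restriction: this user may not see PCH200.
visible = filter_restricted(ranked_hits, restricted_codes={"PCH200"})
print([hit["code"] for hit in visible])
```

Because the search runs inside the application, a filter of this kind can use the current user's data directly rather than relying on an external service.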

Features of the Customer Self Service Lucene Implementation

  • Ability to define search index fields based on one or more fields from the product or category tables
  • Ability to define index field "boost" values, giving higher weight to search terms in particular fields if required.
  • Ability to boost product relevance based on product values, such as the number of times an item has been clicked-through, or the number of times an item has been sold
  • Define "Boost Words" allowing products containing certain words to be given higher relevance (e.g. promote your brand above others)
  • Define Search Query Translations allowing automatic or "did you mean?" (manual) replacement of a user's search query (e.g. user searched for "rubber", system responds with "did you mean 'eraser'?")
  • Spelling correction database generated from the product text which can be used for "did you mean" search suggestions.
  • Optional use of word stemming for indexing and search queries (e.g. laminate/laminating/laminator/laminated are all equivalent when stemmed)
  • Define Search Index Translations allowing items to be found based on alternate text.
  • Define "Trim Words" allowing consistent treatment of suffixes (e.g. allowing "100mm" and "100 mm" to be used interchangeably)
  • Search Tools allowing direct queries against the Search Index, including display of boost values, relevance scores and score explanations, to assist with debugging search query results
  • Configurable settings allowing features of the search engine to be turned on or off.
  • Catalogue Browse based on relevance - listing products based on their score (including boost values)
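Two of the features above, Trim Words and Search Query Translations, amount to rewriting text before it reaches the index. The Python sketch below shows one way such rewriting could work. It is not the Customer Self Service implementation: the trim word list, the "biro" translation and every function name are assumptions, and only the "rubber" to "eraser" suggestion comes from the example above.

```python
import re

# Hypothetical configuration values.
TRIM_WORDS = ("mm", "cm", "kg")
AUTO_TRANSLATIONS = {"biro": "pen"}              # replaced silently
SUGGESTED_TRANSLATIONS = {"rubber": "eraser"}    # offered as "did you mean?"

# A number, then whitespace, then a trim word, e.g. "100 mm".
_TRIM_PATTERN = re.compile(
    r"(\d+)\s+(" + "|".join(map(re.escape, TRIM_WORDS)) + r")\b",
    re.IGNORECASE,
)


def apply_trim_words(text):
    """Join a number to the trim word that follows it: '100 mm' -> '100mm'."""
    return _TRIM_PATTERN.sub(lambda m: m.group(1) + m.group(2).lower(), text)


def prepare_query(query):
    """Return (query to execute, 'did you mean' suggestion or None)."""
    normalised = apply_trim_words(query.lower())
    words = [AUTO_TRANSLATIONS.get(word, word) for word in normalised.split()]
    executed = " ".join(words)
    suggestion = " ".join(SUGGESTED_TRANSLATIONS.get(word, word) for word in words)
    return executed, (suggestion if suggestion != executed else None)


print(apply_trim_words("100 mm ruler"))
print(prepare_query("Rubber"))
```

If the same normalisation is applied to product text at indexing time and to queries at search time, "100mm" and "100 mm" end up as the same token, which is what makes them interchangeable.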

 
