Customize Pagefind's result ranking

Pagefind’s default search algorithm is a great choice for most circumstances, but search results for some datasets can be improved by changing the way they are ranked.

A good example is sites with a mix of long and short pages, where the long pages tend to be the preferred result. In this case, tweaking the pageLength and/or termFrequency parameters can improve the search relevance for the specific content.

Ranking parameters are configured within the ranking option passed to Pagefind, which can optionally contain any or all of the available parameters.

#Configuring ranking parameters

Ranking parameters can be passed to the JavaScript API via pagefind.options():

const pagefind = await import("/pagefind/pagefind.js");
await pagefind.options({
    ranking: {
        // optional parameters, e.g.:
        termFrequency: 1.0,
    }
});

Ranking parameters can be passed to the Default UI during initialization:

new PagefindUI({
    element: "#search",
    ranking: {
        // optional parameters, e.g.:
        pageLength: 0.75
    }
});

#Configuring Term Frequency

await pagefind.options({
    ranking: {
        termFrequency: 1.0 // default value
    }
});

termFrequency controls the balance between two ranking signals: how frequent the term is relative to the length of the document, and the weighted count of the term.

As an example, if we queried search in the sentence “Pagefind is a search tool that can search websites”, the term frequency of search would be 0.22 (2 occurrences / 9 words), while the weighted term count of search would be 2. The weighted term count also factors in any custom weights applied to the content.
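The arithmetic above can be reproduced with a short sketch. It assumes a naive whitespace tokenizer and exact, case-insensitive matching; Pagefind's real indexing (stemming, custom weights) is more involved, and termStats is a hypothetical helper rather than part of the Pagefind API.

```javascript
// Hypothetical helper: counts a term and divides by the total word count.
// Uses naive whitespace splitting, which is an assumption for illustration.
function termStats(text, term) {
    const words = text.toLowerCase().split(/\s+/).filter(Boolean);
    const count = words.filter((word) => word === term.toLowerCase()).length;
    return { count, totalWords: words.length, frequency: count / words.length };
}

const stats = termStats(
    "Pagefind is a search tool that can search websites",
    "search"
);

console.log(stats.count);                // 2
console.log(stats.totalWords);           // 9
console.log(stats.frequency.toFixed(2)); // 0.22
```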

Reducing the termFrequency parameter is a good way to boost longer documents in your search results, as they no longer get penalized for having a low term frequency, and instead get promoted for having many instances of the search term.
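As a minimal sketch of that adjustment, the object below lowers termFrequency from its default. The value 0.4 and the name longPageRanking are illustrative assumptions, not recommendations from Pagefind; tune the number against your own content.

```javascript
// Illustrative value only: anything below the 1.0 default shifts weight
// from relative term frequency toward the weighted term count.
const longPageRanking = {
    termFrequency: 0.4,
};

// Pass the object as the ranking option, for example:
// await pagefind.options({ ranking: longPageRanking });
// new PagefindUI({ element: "#search", ranking: longPageRanking });
```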

#Configuring Term Similarity

await pagefind.options({
    ranking: {
        termSimilarity: 1.0 // default value
    }
});

termSimilarity changes the ranking based on similarity of terms to the search query. Currently this only takes the length of the term into account.

Increasing this number means pages rank higher when they contain words very close to the query. For example, when searching for part, a page containing party will be boosted more than a page containing partition.

The minimum value is 0.0, where party and partition would be viewed equally.

Increasing the termSimilarity parameter is a good way to suppress pages that are ranking well for long extensions of search terms.
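To see the length gap that this parameter acts on, the sketch below measures how many extra characters each matching word has beyond the query. The extraCharacters helper is hypothetical and is not Pagefind's scoring formula, which this page does not spell out; it only illustrates why party counts as closer to part than partition does.

```javascript
// Hypothetical helper: how many characters a matched word extends past the query.
// Returns null when the word does not start with the query at all.
function extraCharacters(query, word) {
    if (!word.startsWith(query)) return null;
    return word.length - query.length;
}

console.log(extraCharacters("part", "party"));     // 1
console.log(extraCharacters("part", "partition")); // 5
```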

#Configuring Page Length

await pagefind.options({
    ranking: {
        pageLength: 0.75 // default value
    }
});

pageLength changes how strongly ranking takes into account the length of each page compared with the average page length on your site.

Decreasing the pageLength parameter is a good way to suppress very short pages that are undesirably ranking higher than longer pages.
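This page does not give the underlying formula. Assuming a BM25-style length normalization, where pageLength plays the role of the b parameter, the effect can be sketched as follows. The function name, the word counts, and the formula itself are assumptions for illustration, and Pagefind's actual implementation may differ.

```javascript
// Assumed BM25-style normalization factor.
// b = 0 ignores page length entirely; b = 1 applies it in full.
// In BM25 this factor sits in the denominator of the score, so a factor
// below 1 favors a page and a factor above 1 penalizes it.
function lengthNormalization(pageWords, averageWords, b) {
    return 1 - b + b * (pageWords / averageWords);
}

const averageWords = 1000; // hypothetical site average

for (const b of [0.75, 0.25]) {
    const shortPage = lengthNormalization(100, averageWords, b);
    const longPage = lengthNormalization(2000, averageWords, b);
    console.log(`b=${b} short=${shortPage.toFixed(3)} long=${longPage.toFixed(3)}`);
}
```

Under this assumption, lowering b pulls both factors toward 1, so short pages lose some of their advantage and long pages are penalized less.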

#Configuring Term Saturation

await pagefind.options({
    ranking: {
        termSaturation: 1.4 // default value
    }
});

termSaturation controls how quickly a term “saturates” on a page. Once a term has appeared on a page many times, further appearances have a reduced impact on the page rank.

Decreasing the termSaturation parameter is a good way to suppress pages that are ranking well only because the search terms occur an extremely high number of times in their content.
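Saturation can be illustrated with the term-frequency component of the standard BM25 formula, where termSaturation is assumed to correspond to the k1 parameter. This is a sketch under that assumption, with length normalization left out; saturatedScore is a hypothetical name and Pagefind's exact calculation is not shown on this page.

```javascript
// Assumed BM25-style saturation curve: grows with the term count,
// but never exceeds k1 + 1.
function saturatedScore(count, k1) {
    return (count * (k1 + 1)) / (count + k1);
}

for (const k1 of [1.4, 0.5]) {
    const scores = [1, 10, 100].map((count) => saturatedScore(count, k1).toFixed(2));
    console.log(`k1=${k1}: ${scores.join(", ")}`);
}
```

With the lower k1, the score flattens out after far fewer occurrences, so a page that repeats the term a hundred times gains little over one that uses it ten times.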