Multilingual search

Pagefind supports multilingual sites out of the box, with zero configuration.

When indexing, Pagefind will look for a lang attribute on your html element. Indexing will then run independently for each detected language. When Pagefind initializes in the browser it will check the same lang attribute and load the appropriate index.
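As a rough illustration of the per-language split described above, the sketch below groups pages by the lang attribute of their html element. It is not Pagefind's implementation: the `Page` shape, the `detectLang` and `groupByLang` helpers, the lowercasing step, and the "unknown" bucket name are all assumptions made for this example.

```typescript
// Hypothetical page shape for this sketch; not a Pagefind type.
interface Page {
  url: string;
  html: string;
}

// Extract the lang attribute from the opening <html> tag, if present.
function detectLang(html: string): string | undefined {
  const match = /<html\b[^>]*?\slang\s*=\s*["']?([A-Za-z0-9-]+)/i.exec(html);
  // Lowercasing is an assumption of this sketch, so "pt-BR" and "pt-br" share a bucket.
  return match ? match[1].toLowerCase() : undefined;
}

// Group page URLs into one bucket per detected language.
// The "unknown" bucket name is a placeholder chosen for this example.
function groupByLang(pages: Page[]): Map<string, string[]> {
  const buckets = new Map<string, string[]>();
  for (const page of pages) {
    const lang = detectLang(page.html) ?? "unknown";
    const urls = buckets.get(lang) ?? [];
    urls.push(page.url);
    buckets.set(lang, urls);
  }
  return buckets;
}
```

Each bucket stands in for one independent index, which is why a search started on a pt-br page only sees pt-br pages.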

If you load Pagefind search on a page tagged as <html lang="pt-br">, you will automatically search only the pages on the site with the same language. Pagefind will also adapt any stemming algorithms to the target language if supported. This applies to both the Pagefind JS API and the Pagefind UI.
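A minimal sketch of a search through the JS API follows. The `PagefindLike` and `PagefindResult` interfaces and the `topResultUrls` helper are names invented for this example; they describe only the `search()` call and the per-result `data()` loader used here, and the Pagefind module is passed in as an argument so the function can be exercised without a built index.

```typescript
// Minimal structural types for the parts of the Pagefind JS API used here.
interface PagefindResult {
  data(): Promise<{ url: string; excerpt: string }>;
}
interface PagefindLike {
  search(term: string): Promise<{ results: PagefindResult[] }>;
}

// Run a search and load the data for the first `limit` results.
async function topResultUrls(
  pagefind: PagefindLike,
  term: string,
  limit = 5
): Promise<string[]> {
  const search = await pagefind.search(term);
  // Each result is loaded lazily; only fetch the ones that will be shown.
  const loaded = await Promise.all(
    search.results.slice(0, limit).map((result) => result.data())
  );
  return loaded.map((page) => page.url);
}
```

In the browser, the `pagefind` argument would typically come from a dynamic import of the generated bundle, for example `await import("/pagefind/pagefind.js")`, with the path depending on where your bundle was written. No language is passed to the search call: the index matching the page's lang attribute is chosen when Pagefind initializes.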

The Pagefind UI itself is translated into a range of languages, and will adapt automatically to the page language if possible.

Setting the force language option when indexing (--force-language on the CLI) opts out of this feature and creates a single index for the whole site.

#Language support

Pagefind will work automatically for any language. Explicit language support improves the quality of search results and the Pagefind UI.

If word stemming is unsupported, search results won’t match across root words. If UI translations are unsupported, the Pagefind UI will be shown in English.
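To make the effect of stemming concrete, here is a toy sketch. The `toyStem` function is a deliberately crude English suffix stripper written for this example; it is not the stemming algorithm Pagefind uses, and `matchesWord` is likewise a hypothetical helper.

```typescript
// Toy suffix-stripping stemmer for English; far cruder than a real stemming algorithm.
function toyStem(word: string): string {
  for (const suffix of ["ing", "ed", "s"]) {
    if (word.length > suffix.length + 2 && word.endsWith(suffix)) {
      return word.slice(0, -suffix.length);
    }
  }
  return word;
}

// Match a query word against the words on a page, optionally comparing stems.
function matchesWord(query: string, pageWords: string[], useStemming: boolean): boolean {
  const normalize = (w: string): string =>
    useStemming ? toyStem(w.toLowerCase()) : w.toLowerCase();
  return pageWords.some((w) => normalize(w) === normalize(query));
}

const pageWords = ["She", "walked", "home"];
console.log(matchesWord("walking", pageWords, true));  // true: both reduce to "walk"
console.log(matchesWord("walking", pageWords, false)); // false: "walking" !== "walked"
```

With stemming available, a query for one form of a word can reach pages containing another form; without it, only the exact word matches.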

Language UI Translations Word Stemming
Afrikaans — af
Arabic — ar
Armenian — hy
Basque — eu
Bengali — bn
Catalan — ca
Chinese — zh See below
Croatian — hr
Danish — da
Dutch — nl
English — en
Finnish — fi
French — fr
Galician — gl
German — de
Greek — el
Hindi — hi
Hungarian — hu
Indonesian — id
Irish — ga
Italian — it
Japanese — ja See below
Lithuanian — lt
Māori — mi
Nepali — ne
Norwegian — no
Polish — pl
Portuguese — pt
Romanian — ro
Russian — ru
Serbian — sr
Spanish — es
Swedish — sv
Tamil — ta
Turkish — tr
Vietnamese — vi
Yiddish — yi

Feel free to open an issue if there’s a language you would like better support for, or contribute a translation for Pagefind UI in your language.

#Specialized languages

This section currently applies to Chinese and Japanese languages. Specialized languages are only supported in Pagefind’s extended release, which is the default when running npx pagefind.

When indexing, Pagefind does not currently stem specialized languages, but it does segment text whose words are not separated by whitespace.

Pagefind does not yet support segmentation of the search query, so searching in the browser requires that words in the search query are separated by whitespace.

In practice, this means that on a page tagged as a zh- language, 每個月都 will be indexed as the words 每個, 月, and 都.

When searching in the browser, searching for 每個, 月, or 都 individually will work. Additionally, searching 每個 月 都 will return results containing each word in any order, and searching "每個 月 都" in quotes will match 每個月都 exactly.

Searching for 每個月都 will return zero results, as Pagefind is not able to segment it into words in the browser. Work to improve this is underway and will hopefully remove this limitation in the future.
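The behavior above can be modeled with a small sketch. This is a simplified illustration, not Pagefind's matching logic: `queryTerms` and `pageMatches` are hypothetical helpers, and the model ignores quoted exact-phrase searches and ranking. It only shows why a query that is not split on whitespace finds nothing.

```typescript
// Words produced at index time for 每個月都, as described above.
const indexedWords = new Set(["每個", "月", "都"]);

// Split only on whitespace, mirroring that the query is not segmented in the browser.
function queryTerms(query: string): string[] {
  return query.trim().split(/\s+/).filter((term) => term.length > 0);
}

// In this model, a page matches when every whitespace-separated term is an indexed word.
function pageMatches(query: string, words: Set<string>): boolean {
  const terms = queryTerms(query);
  return terms.length > 0 && terms.every((term) => words.has(term));
}

console.log(pageMatches("每個", indexedWords));       // true
console.log(pageMatches("每個 月 都", indexedWords)); // true
console.log(pageMatches("每個月都", indexedWords));   // false: treated as one unknown term
```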