diff --git a/_analyzers/index.md b/_analyzers/index.md index 95f97ec8ce..9b999e5c3d 100644 --- a/_analyzers/index.md +++ b/_analyzers/index.md @@ -170,6 +170,28 @@ The response provides information about the analyzers for each field: } ``` +## Normalizers +Tokenization divides text into individual terms, but it does not address variations in token forms. Normalization resolves these issues by converting tokens into a standard format. This ensures that similar terms are matched appropriately, even if they are not identical. + +### Normalization techniques + +The following normalization techniques can help address variations in token forms: +1. **Case normalization**: Converts all tokens to lowercase to ensure case-insensitive matching. For example, "Hello" is normalized to "hello". + +2. **Stemming**: Reduces words to their root form. For instance, "cars" is stemmed to "car", and "running" is normalized to "run". + +3. **Synonym handling:** Treats synonyms as equivalent. For example, "jogging" and "running" can be indexed under a common term, such as "run". + +### Normalization + +A search for `Hello` will match documents containing `hello` because of case normalization. + +A search for `cars` will also match documents containing `car` because of stemming. + +A query for `running` can retrieve documents containing `jogging` using synonym handling. + +Normalization ensures that searches are not limited to exact term matches, allowing for more relevant results. For instance, a search for `Cars running` can be normalized to match `car run`. + ## Next steps -- Learn more about specifying [index analyzers]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) and [search analyzers]({{site.url}}{{site.baseurl}}/analyzers/search-analyzers/). \ No newline at end of file +- Learn more about specifying [index analyzers]({{site.url}}{{site.baseurl}}/analyzers/index-analyzers/) and [search analyzers]({{site.url}}{{site.baseurl}}/analyzers/search-analyzers/).