
Add query_by_tokens option in Neural Sparse Search #7040

Merged
38 changes: 35 additions & 3 deletions _query-dsl/specialized/neural-sparse.md
@@ -10,11 +10,12 @@
Introduced 2.11
{: .label .label-purple }

Use the `neural_sparse` query for vector field search in [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
Use the `neural_sparse` query for vector field search in [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). The query can use either raw text or sparse vector tokens.

## Request fields

Include the following request fields in the `neural_sparse` query:

### Example: Query by raw text

```json
"neural_sparse": {
Expand All @@ -24,16 +25,26 @@
}
}
```
### Example: Query by sparse vector

```json
"neural_sparse": {
"<vector_field>": {
"query_tokens": "<query_tokens>"
}
}
```

The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other `neural_sparse` query fields.

Field | Data type | Required/Optional | Description
:--- | :--- | :--- | :---
`query_text` | String | Required | The query text from which to generate vector embeddings.
`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/).
`query_text` | String | Optional | The query text from which to generate sparse vector embeddings.
`model_id` | String | Optional | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). For information about setting a default model ID in a neural sparse query, see [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/) and the pipeline sketch after this table.

`query_tokens` | Map<String, Float> | Optional | The query tokens, sometimes referred to as sparse vector embeddings. Similarly to dense semantic retrieval, you can use raw sparse vectors generated by neural models or tokenizers to perform a semantic search query. Use either the `query_text` option for raw field vectors or the `query_tokens` option for sparse vectors. Must be provided in order for the `neural_sparse` query to operate.
`max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12.
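
To supply that default model ID, you can create a search pipeline with a `neural_query_enricher` request processor. The following is a minimal sketch, assuming a hypothetical pipeline name and a placeholder model ID:

```json
PUT /_search/pipeline/default_model_pipeline
{
  "request_processors": [
    {
      "neural_query_enricher": {
        "default_model_id": "<model_id>"
      }
    }
  ]
}
```

With such a pipeline set as an index's default search pipeline, `neural_sparse` queries against that index can omit `model_id`.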

#### Example request
**Query by raw text**

```json
GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "<query_text>",
        "model_id": "<model_id>"
      }
    }
  }
}
```
**Query by sparse vector**

```json
GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_tokens": {
          "hi": 4.338913,
          "planets": 2.7755864,
          "planet": 5.0969057,
          "mars": 1.7405145,
          "earth": 2.6087382,
          "hello": 3.3210192
        }
      }
    }
  }
}
```
{% include copy-curl.html %}
29 changes: 25 additions & 4 deletions _search-plugins/neural-sparse-search.md
@@ -16,7 +16,7 @@ Introduced 2.11
When selecting a model, choose one of the following options:

- Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency).
- Use a sparse encoding model at ingestion time and a tokenizer model at search time (low performance, relatively low latency).
- Use a sparse encoding model at ingestion time and a tokenizer at search time (low performance, relatively low latency). The tokenizer doesn't conduct model inference, so you can deploy and invoke it using the ML Commons Model API for a more consistent experience, as sketched below.
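
As a sketch of that workflow, you can generate query tokens by calling the ML Commons Predict API on a deployed sparse encoding model or tokenizer. The endpoint format is standard, but the model ID and input text below are placeholders:

```json
POST /_plugins/_ml/models/<model_id>/_predict
{
  "text_docs": ["<query_text>"]
}
```

The token-to-weight map in the response can be passed directly as `query_tokens` in a `neural_sparse` query.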

**PREREQUISITE**<br>
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
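
As a minimal sketch, registering one of the OpenSearch-provided pretrained sparse encoding models might look like the following; the model version and format shown are assumptions, so check the pretrained models page for current values:

```json
POST /_plugins/_ml/models/_register
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```

When the register task completes, deploy the model with `POST /_plugins/_ml/models/<model_id>/_deploy`.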
@@ -29,7 +29,7 @@ To use neural sparse search, follow these steps:
1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline).
1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-search).
1. [Search the index using neural search](#step-4-search-the-index-using-neural-sparse-search).

## Step 1: Create an ingest pipeline

@@ -144,11 +144,11 @@ PUT /my-nlp-index/_doc/2

Before the document is ingested into the index, the ingest pipeline runs the `sparse_encoding` processor on the document, generating vector embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings.
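
For reference, a minimal sketch of an ingest pipeline with a `sparse_encoding` processor follows; the pipeline name and model ID are placeholders rather than values from this diff:

```json
PUT /_ingest/pipeline/nlp-ingest-pipeline-sparse
{
  "description": "A sparse encoding ingest pipeline (sketch)",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "<model_id>",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}
```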

## Step 4: Search the index using neural search
## Step 4: Search the index using neural sparse search

To perform a neural sparse search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries.

The following example request uses a `neural_sparse` query to search for relevant documents:
The following example request uses a `neural_sparse` query to search for relevant documents using a raw text query:

```json
GET my-nlp-index/_search
@@ -241,6 +241,27 @@ The response contains the matching documents:
}
```

You can also use the `neural_sparse` query with sparse vector embeddings:

```json
GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_tokens": {
          "hi": 4.338913,
          "planets": 2.7755864,
          "planet": 5.0969057,
          "mars": 1.7405145,
          "earth": 2.6087382,
          "hello": 3.3210192
        }
      }
    }
  }
}
```

## Setting a default model on an index or field

A [`neural_sparse`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/) query requires a model ID for generating sparse embeddings. To avoid passing the model ID with each `neural_sparse` query request, you can set a default model at the index or field level.
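
For example, assuming a `neural_query_enricher` search pipeline named `default_model_pipeline` (a hypothetical name, as sketched earlier), you can set it as the index default so that `neural_sparse` queries against the index can omit the model ID:

```json
PUT /my-nlp-index/_settings
{
  "index.search.default_pipeline": "default_model_pipeline"
}
```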