Implement OpenSearch ANN #1225

Merged (25 commits) on Jul 26, 2021

@brandenchan (Contributor) commented Jun 24, 2021

OpenSearch has approximate nearest neighbor (ANN) search capabilities. Let's add support for these features!

Only OpenSearch supports dot product ANN; Open Distro does not. Let's also switch from Open Distro to OpenSearch.

  • Rename OpenDistroElasticsearchDocumentStore?
  • Assess whether the inheritance structure works, since ElasticsearchDocumentStore (ESDS) doesn't support ANN
  • Update documentation
  • Create OpenSearch util function
  • Make ANN work
  • Benchmark
  • Handling of ES and OS ports so that benchmarks can be run
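As a rough illustration of the "Make ANN work" item above, the sketch below builds an index-creation body that enables the k-NN plugin with dot product similarity. The helper name `build_knn_index_body`, the field name `embedding`, and the dimension of 768 are assumptions made for this example and are not taken from the PR. The setting and mapping keys (`knn`, `knn.space_type`, `knn_vector`) reflect my understanding of the OpenSearch k-NN plugin around version 1.0 and should be checked against the documentation for the version in use.

```python
from typing import Any, Dict


def build_knn_index_body(
    embedding_field: str = "embedding",  # assumed field name
    embedding_dim: int = 768,            # assumed embedding size
    space_type: str = "innerproduct",    # dot product similarity
) -> Dict[str, Any]:
    """Build a create-index body with approximate k-NN enabled (illustrative)."""
    if embedding_dim < 1:
        raise ValueError("embedding_dim must be a positive integer")
    return {
        "settings": {
            "index": {
                "knn": True,                   # enable the k-NN plugin for this index
                "knn.space_type": space_type,  # similarity used by the ANN graph
            }
        },
        "mappings": {
            "properties": {
                # Vector field that the k-NN plugin indexes for ANN search
                embedding_field: {"type": "knn_vector", "dimension": embedding_dim}
            }
        },
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_knn_index_body(embedding_dim=4), indent=2))
```

The resulting dictionary would be passed as the body of an index-creation request. Building it in a separate function keeps it easy to inspect and test without a running cluster.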

@brandenchan (Contributor, Author) commented:

OpenSearch HNSW appears to be implemented, and simple queries run. However, I still need to benchmark it to confirm that an HNSW index is really being used and that it is actually faster. When running the benchmarking script, DocumentStore.query_by_embedding() raises an error: the body of the query request appears to be invalid. I will need to investigate this further.
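For reference while debugging a rejected request body like the one described above, here is a minimal sketch of what an approximate k-NN query body could look like. The helper name `build_knn_query` and the field name `embedding` are hypothetical, and the `knn` query shape follows my understanding of the OpenSearch k-NN plugin. The sketch also converts the vector to plain floats, because non-serializable values (for example NumPy scalars) are one common way a request body ends up invalid. This is not a claim about the actual cause of the error in this PR.

```python
import json
from typing import Any, Dict, Sequence


def build_knn_query(
    query_emb: Sequence[float],
    top_k: int = 10,
    embedding_field: str = "embedding",  # assumed field name
) -> Dict[str, Any]:
    """Build an approximate k-NN search body (illustrative, not the PR's code)."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if len(query_emb) == 0:
        raise ValueError("query_emb must not be empty")
    body = {
        "size": top_k,
        "query": {
            "knn": {
                embedding_field: {
                    # Plain Python floats keep the body JSON-serializable
                    "vector": [float(x) for x in query_emb],
                    "k": top_k,
                }
            }
        },
    }
    json.dumps(body)  # raises TypeError early if anything is not serializable
    return body


if __name__ == "__main__":
    print(json.dumps(build_knn_query([0.1, 0.2, 0.3], top_k=3)))
```

Serializing the body locally before sending it makes malformed requests fail fast, with a Python traceback rather than a server-side parsing error.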

@brandenchan brandenchan mentioned this pull request Jul 14, 2021
@brandenchan brandenchan self-assigned this Jul 14, 2021
@brandenchan (Contributor, Author) commented:

Fixes #1079


In addition to native Elasticsearch query & filtering, it provides efficient vector similarity search using
the KNN plugin that can scale to a large number of documents.
"""

def __init__(self, **kwargs):
Contributor

It might be more readable to use explicit arguments (with their corresponding defaults).
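The difference the reviewer points at can be sketched as follows. All class names, parameter names, and defaults here (`BaseStore`, `KwargsStore`, `ExplicitStore`, `host`, `port`, `similarity`) are hypothetical stand-ins and not the signatures used in the PR. The point is only that an explicit signature is visible to readers, IDEs, and `inspect`, while `**kwargs` hides it.

```python
import inspect


class BaseStore:
    """Hypothetical stand-in for the parent document store."""

    def __init__(self, host: str = "localhost", port: int = 9200,
                 similarity: str = "dot_product"):
        self.host = host
        self.port = port
        self.similarity = similarity


class KwargsStore(BaseStore):
    """Forwards everything blindly; the accepted options are invisible here."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)


class ExplicitStore(BaseStore):
    """Spells out each argument and its default, then forwards them."""

    def __init__(self, host: str = "localhost", port: int = 9200,
                 similarity: str = "dot_product"):
        super().__init__(host=host, port=port, similarity=similarity)


if __name__ == "__main__":
    # The explicit variant documents itself through its signature
    print(inspect.signature(KwargsStore.__init__))
    print(inspect.signature(ExplicitStore.__init__))
```

The trade-off is that the explicit variant has to be updated whenever the parent's arguments change, whereas the `**kwargs` variant follows the parent automatically.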

@@ -1053,6 +1100,9 @@ def _create_document_index(self, index_name: str):
            if not self.client.indices.exists(index=index_name):
                raise e

    def stop_service(self):
Contributor

Would it make sense to keep this in the utils instead? It can be misleading, since it only works when OpenSearch is running locally as a Docker container.
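One way the suggestion above could look is a standalone utility whose name and docstring make the Docker-only scope explicit. This is a sketch under assumptions: the function names and the default container name `opensearch` are invented for illustration and are not taken from the PR.

```python
import subprocess
from typing import List


def build_stop_command(container_name: str) -> List[str]:
    """Return the Docker CLI command that stops the named container."""
    if not container_name:
        raise ValueError("container_name must not be empty")
    return ["docker", "stop", container_name]


def stop_opensearch_container(container_name: str = "opensearch") -> bool:
    """Stop a locally running OpenSearch Docker container.

    Only meaningful when OpenSearch was started as a local container.
    Returns True if `docker stop` exited successfully, False otherwise.
    """
    try:
        result = subprocess.run(
            build_stop_command(container_name),
            capture_output=True,
            text=True,
            check=False,  # report failure via the return value instead of raising
        )
    except FileNotFoundError:
        # The docker executable is not installed or not on PATH
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print(build_stop_command("opensearch"))
```

Keeping this outside the document store class avoids implying that every OpenSearch deployment can be stopped this way.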

@brandenchan brandenchan merged commit 363be65 into master Jul 26, 2021
@brandenchan brandenchan deleted the odes_ann branch July 26, 2021 08:52
2 participants