Explore setting up Elasticsearch document store #11

Open
4 of 5 tasks
ugm2 opened this issue Sep 25, 2022 · 1 comment
Comments

@ugm2
Owner

ugm2 commented Sep 25, 2022

Elasticsearch needs its own setup to work (you have to spin up an Elasticsearch instance), so this issue explores how to do that for the Hugging Face Spaces demo.

Using this Document Store will let us use the BM25 keyword search algorithm, which generally outperforms TF-IDF (the one we currently use).

This HF Space demo seems to have managed to set it up, so it could serve as an example. It appears to use environment variables that are set internally on Hugging Face. The caveat is that, to run the demo locally, the user needs a running Elasticsearch instance and must set those environment variables.
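To make the environment-variable idea concrete, here is a minimal sketch of how the app could read its Elasticsearch connection settings. The variable names (`ELASTICSEARCH_HOST`, `ELASTICSEARCH_PORT`, `ELASTICSEARCH_USERNAME`, `ELASTICSEARCH_PASSWORD`) and the `load_settings` helper are hypothetical; the referenced demo may use different names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ElasticsearchSettings:
    """Connection settings read from the environment (hypothetical names)."""
    host: str
    port: int
    username: str
    password: str


def load_settings(env: Mapping[str, str] = os.environ) -> Optional[ElasticsearchSettings]:
    """Return the settings, or None when no Elasticsearch host is configured."""
    host = env.get("ELASTICSEARCH_HOST")
    if not host:
        # Nothing configured: the caller can decide to fall back to another store.
        return None
    return ElasticsearchSettings(
        host=host,
        # 9200 is the default Elasticsearch HTTP port.
        port=int(env.get("ELASTICSEARCH_PORT", "9200")),
        username=env.get("ELASTICSEARCH_USERNAME", ""),
        password=env.get("ELASTICSEARCH_PASSWORD", ""),
    )
```

On HF Spaces these values would presumably be set as secrets in the Space settings; locally they could be exported in the shell before launching the demo.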

Definition of done:

  • Set up Elasticsearch on HF Spaces
  • Use it as Document Store
  • Use BM25 Retriever
  • Document in the README how to run the demo locally using Elasticsearch
  • Think about a workaround for when Elasticsearch is not running: default to TF-IDF and InMemoryStore with a warning message
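
The fallback in the last task could be sketched roughly as follows. This is only an illustration under assumptions: the helper names (`elasticsearch_reachable`, `choose_backend`, `build_retriever`) are hypothetical, the reachability check is a plain TCP connect (an open port does not prove that Elasticsearch itself is healthy), and `build_retriever` assumes the Haystack 1.x classes `ElasticsearchDocumentStore`, `InMemoryDocumentStore`, `BM25Retriever` and `TfidfRetriever`.

```python
import socket
import warnings
from typing import Optional


def elasticsearch_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_backend(host: Optional[str], port: int = 9200) -> str:
    """Pick "elasticsearch" when reachable, otherwise warn and pick "memory"."""
    if host and elasticsearch_reachable(host, port):
        return "elasticsearch"
    warnings.warn(
        "Elasticsearch is not reachable; falling back to the in-memory "
        "document store with TF-IDF retrieval.",
        RuntimeWarning,
    )
    return "memory"


def build_retriever(host: Optional[str], port: int = 9200):
    """Build a retriever for the chosen backend (assumes Haystack 1.x)."""
    # Imported lazily so the helpers above work without Haystack installed.
    from haystack.document_stores import (
        ElasticsearchDocumentStore,
        InMemoryDocumentStore,
    )
    from haystack.nodes import BM25Retriever, TfidfRetriever

    if choose_backend(host, port) == "elasticsearch":
        store = ElasticsearchDocumentStore(host=host, port=port)
        return BM25Retriever(document_store=store)
    store = InMemoryDocumentStore()
    return TfidfRetriever(document_store=store)
```

Keeping the decision in `choose_backend` separate from the Haystack-specific construction makes the fallback easy to test without a running Elasticsearch instance.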
@ugm2 ugm2 added the enhancement New feature or request label Sep 25, 2022
@ugm2 ugm2 self-assigned this Sep 25, 2022
@anakin87

Hey @ugm2,
some months ago I contributed to Haystack with:

They could be useful for your project...
