Support searchable attributes for unstructured indexes #968

Merged
papa99do merged 81 commits into mainline from yihan/searchable-attributes-unstructured
Oct 7, 2024

Collaborator

@papa99do papa99do commented Sep 15, 2024

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
    Feature

  • What is the current behavior? (You can also link to an open issue here)
    Searchable attributes are not supported for unstructured indexes because the embeddings of all tensor fields are stored in one field and all string fields are stored in one Vespa field.

  • What is the new behavior (if this is a feature change)?
    Searchable attributes are supported for unstructured indexes. When new documents are added to an unstructured index, new lexical and tensor fields are dynamically added to the index. This allows users to specify searchable attributes and search selectively on certain fields when using tensor, lexical, or hybrid search.

  • Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)
    No

  • Have unit tests been run against this PR? (Has there also been any additional testing?)
    Yes

  • Related Python client changes (link commit/PR here)
    N/A. The client remains unchanged.

  • Related documentation changes (link commit/PR here)
    To be provided

  • Other information:
    This PR also includes the following changes, which are required to add this feature:

  • A refactoring of the add_document logic
  • Extracted an add_document method into the base MarqoTestCase. It force-refreshes the index after adding docs so that subsequent get and search requests always see the latest index settings in all tests.
  • Introduced a SemiStructuredMarqoIndex type and a corresponding VespaIndex implementation. From now on, this index type is created whenever an UnstructuredMarqoIndexRequest is received. The UnstructuredMarqoIndex type is kept so that unstructured indexes created before this release continue to work as is.
  • Please check if the PR fulfills these requirements
  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes/features)
  • Docs have been added / updated (for bug fixes / features)
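
For reviewers who want a mental model of the new behavior described above, here is a minimal sketch in plain Python. It is not the code in this PR: `ToySemiStructuredIndex`, its methods, and the substring matching are all hypothetical stand-ins, and tensor fields are only recorded by name rather than embedded. It shows fields being registered as documents arrive, and a lexical search being restricted to the requested searchable attributes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToySemiStructuredIndex:
    """Hypothetical toy model; not Marqo's SemiStructuredMarqoIndex."""

    lexical_fields: set[str] = field(default_factory=set)
    tensor_fields: set[str] = field(default_factory=set)
    docs: list[dict] = field(default_factory=list)

    def add_document(self, doc: dict, tensor_fields: list[str]) -> None:
        # Each string field is registered as a lexical field when first seen.
        for name, value in doc.items():
            if name != "_id" and isinstance(value, str):
                self.lexical_fields.add(name)
        # Requested tensor fields are registered only if present in the doc.
        self.tensor_fields.update(f for f in tensor_fields if f in doc)
        self.docs.append(doc)

    def lexical_search(
        self, term: str, searchable_attributes: list[str] | None = None
    ) -> list[str]:
        # Without searchable attributes, every known lexical field is searched.
        if searchable_attributes is None:
            fields = set(self.lexical_fields)
        else:
            fields = set(searchable_attributes) & self.lexical_fields
        needle = term.lower()
        hits = []
        for doc in self.docs:
            # Naive substring match stands in for real lexical scoring.
            if any(
                isinstance(doc.get(f), str) and needle in doc[f].lower()
                for f in fields
            ):
                hits.append(doc["_id"])
        return hits


index = ToySemiStructuredIndex()
index.add_document(
    {"_id": "1", "title": "red shoes", "description": "leather"},
    tensor_fields=["title"],
)
index.add_document(
    {"_id": "2", "title": "blue hat", "notes": "matches red shoes"},
    tensor_fields=["notes"],
)

print(sorted(index.lexical_fields))  # ['description', 'notes', 'title']
print(index.lexical_search("red"))  # ['1', '2']
print(index.lexical_search("red", searchable_attributes=["title"]))  # ['1']
```

The second document introduces a `notes` field that the index had not seen before, which is the case the old single-field layout could not search selectively.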

@papa99do papa99do force-pushed the yihan/searchable-attributes-unstructured branch from 912d57f to 93a67da Compare September 16, 2024 00:52
@papa99do papa99do force-pushed the yihan/searchable-attributes-unstructured branch from 93a67da to 420b6c4 Compare September 16, 2024 06:15
@papa99do papa99do force-pushed the yihan/searchable-attributes-unstructured branch from 420b6c4 to 31aac27 Compare September 16, 2024 10:43
@papa99do papa99do force-pushed the yihan/searchable-attributes-unstructured branch from 31aac27 to 81e193e Compare September 16, 2024 10:53
@papa99do papa99do force-pushed the yihan/searchable-attributes-unstructured branch from 245ab62 to 5ad91dd Compare September 17, 2024 07:27
@papa99do papa99do force-pushed the yihan/searchable-attributes-unstructured branch from 5ad91dd to 38396cb Compare September 17, 2024 10:26
@papa99do papa99do merged commit fcbc982 into mainline Oct 7, 2024
5 of 6 checks passed
@papa99do papa99do deleted the yihan/searchable-attributes-unstructured branch October 7, 2024 03:34
2 participants