feat: Orama search #162

Open
wants to merge 5 commits into
base: master

Conversation

PuruVJ
Collaborator

@PuruVJ PuruVJ commented Jun 29, 2023

No description provided.

@changeset-bot

changeset-bot bot commented Jun 29, 2023

⚠️ No Changeset found

Latest commit: 7d47065

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@benmccann
Member

Rather than doing our own sort of the search results after they're returned, let's do it all within the search library. I got it working pretty well in this StackBlitz demo, so you should just be able to copy it: https://stackblitz.com/edit/stackblitz-starters-8wuzli?file=index.js

Here's the code for reference:

import { create, insertMultiple, search } from '@orama/orama';
import { readFileSync } from 'fs';
const query = process.argv[2];

async function run_search() {
  // Read as UTF-8 text so JSON.parse receives a string rather than a Buffer
  const blocks = JSON.parse(readFileSync('content.json', 'utf-8'));
  blocks.forEach((b) => {
    if (b.breadcrumbs && b.breadcrumbs.length >= 1 && b.breadcrumbs[0]) {
      b.h1 = b.breadcrumbs[0];
    }
    if (b.breadcrumbs && b.breadcrumbs.length >= 2 && b.breadcrumbs[1]) {
      b.h2 = b.breadcrumbs[1];
    }
    if (b.breadcrumbs && b.breadcrumbs.length >= 3 && b.breadcrumbs[2]) {
      b.h3 = b.breadcrumbs[2];
    }
    delete b.breadcrumbs;

    if (b.href.startsWith('/docs/migrating')) {
      b.priority = 1;
    } else if (
      b.href.startsWith('/docs/types') ||
      b.href.startsWith('/docs/modules') ||
      b.href.startsWith('/docs/glossary')
    ) {
      b.priority = 2;
    } else {
      b.priority = 3;
    }
  });

  const index = await create({
    schema: {
      content: 'string',
      h1: 'string',
      h2: 'string',
      h3: 'string',
      priority: 'number',
    },
    components: {
      tokenizer: { language: 'english', stemming: true },
    },
  });

  await insertMultiple(index, blocks);

  const results = await search(index, {
    term: query,
    sortBy: (a, b) => {
      const [_docIdA, scoreA, docA] = a;
      const [_docIdB, scoreB, docB] = b;
      return docB.priority * 1000 + scoreB - (docA.priority * 1000 + scoreA);
    },
    boost: {
      h1: 4,
      h2: 3,
      h3: 2,
    },
    limit: blocks.length,
  });

  console.log(results.hits.map(({ document }) => document));
}

run_search();
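
To make the ordering produced by the sortBy function concrete, here is a small standalone sketch that applies the same comparator to hand-written tuples. The sample ids, hrefs, and scores are invented for illustration; only the comparator body comes from the code above, and it assumes scores stay well below 1000 so that priority always dominates.

```javascript
// Same comparator as the sortBy option above: sorts descending by
// priority * 1000 + score, so priority decides first and score orders
// documents within the same priority (as long as scores are below 1000).
function by_priority_then_score(a, b) {
  const [_docIdA, scoreA, docA] = a;
  const [_docIdB, scoreB, docB] = b;
  return docB.priority * 1000 + scoreB - (docA.priority * 1000 + scoreA);
}

// Hypothetical [id, score, document] tuples (values invented for this sketch).
const sample = [
  ['a', 9.5, { href: '/docs/migrating', priority: 1 }],
  ['b', 2.0, { href: '/docs/introduction', priority: 3 }],
  ['c', 7.0, { href: '/docs/types', priority: 2 }],
  ['d', 4.0, { href: '/docs/routing', priority: 3 }],
];

const ordered = [...sample].sort(by_priority_then_score).map(([id]) => id);
console.log(ordered); // [ 'd', 'b', 'c', 'a' ]
```

Note that the migrating page ends up last even though it has the highest raw score, which is the effect the priority field is meant to have.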

@allevo allevo left a comment

Hi! I just left a comment.

  },
  components: {
    tokenizer: { language: 'english', stemming: false }
  }

I see the code uses a custom sort, so the default sorting can be turned off, which reduces insertion time and improves performance.
https://docs.oramasearch.com/usage/search/sorting#disable-sort
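
A minimal sketch of what that could look like, assuming the `sort: { enabled: false }` option described on the linked docs page. The schema is copied from the StackBlitz demo earlier in this thread, and the option name should be checked against the installed Orama version. The object below is what would be passed to create().

```javascript
// Options for Orama's create(). The `sort.enabled` flag is an assumption
// taken from the linked "disable sort" docs page, not verified here.
const create_options = {
  schema: {
    content: 'string',
    h1: 'string',
    h2: 'string',
    h3: 'string',
    priority: 'number',
  },
  components: {
    tokenizer: { language: 'english', stemming: false },
  },
  // Skip building the built-in sort index, since results are ordered
  // by a custom sortBy function at search time instead.
  sort: { enabled: false },
};

// Intended use (not executed in this sketch):
// const index = await create(create_options);
console.log(create_options.sort);
```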

Collaborator Author

Aha, good idea!

@PuruVJ
Collaborator Author

PuruVJ commented Jun 30, 2023

I tried using your approach. It feels the same, but now everything under the Types sections is gone. I'm looking into how to get them back.

CleanShot.2023-06-30.at.17.21.57.mp4

Comment on lines +119 to +122
// console.log(search_results);
// console.log(results);

const results = tree([], blocks).children;
console.log(buildBlockTree(blocks));
Member

These few lines should be removed.

const blocks = [];

for (const result of search_results) {
  // @ts-ignore
Member

Is this @ts-ignore needed? It doesn't look like it would be, but if so, perhaps it should be @ts-expect-error?
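
For context, a small sketch of the difference, assuming the file is type-checked as JavaScript (for example via `// @ts-check`). The `block` object and its `rank` property are hypothetical and not taken from this PR. `@ts-ignore` silences whatever error is on the next line and stays silent if there is none, whereas `@ts-expect-error` is itself reported as unused once the next line type-checks cleanly.

```javascript
// @ts-check

/** @type {{ href: string }} */
const block = { href: '/docs/types' };

// `rank` is not declared in the JSDoc type above, so the assignment is a
// real type error and the directive is doing work. If the type is later
// widened to include `rank`, TypeScript flags the directive as unused
// instead of leaving a stale suppression behind.
// @ts-expect-error
block.rank = 1;

console.log(block.rank);
```

At runtime the directive is just a comment, so behavior is unchanged; the benefit is only in what the type checker reports.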

Member

@benmccann benmccann left a comment

LGTM, just a few minor comments.

@benmccann
Member

I just tried searching for let: on the current svelte.dev and it's pretty painful. I had to give up and click through every page doing a Ctrl+F. Hopefully we can support that as part of this change.

@micheleriva

Hi @benmccann, can we help in any way to get this PR through?

4 participants