Training without labels - adapting existing pretrained backbone #240

Open
keithkam-yk opened this issue Aug 18, 2024 · 0 comments
Use case:

Questions:

  1. Can I provide training data that is not labelled, i.e. not in pairs/triplets? What would be the recommended way to freeze the pretrained backbone and train only the linear projection + k-means sections? (See the first sketch after this list.)
  2. Is there a way to decouple the initial backbone encoding from this training? I.e. can I embed my unsupervised training corpus up front with the backbone model and then use those embeddings as my training set, instead of raw text input? (See the second sketch after this list.)
    a. The use case is to test many configurations of the downstream projection layer + k-means indexing, with the aim of reducing the GPU cost of encoding.
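For concreteness, the kind of setup I have in mind for question 1 is roughly the following PyTorch sketch: a frozen Hugging Face backbone with a trainable linear projection on top. The backbone name, projection width, and learning rate here are placeholders of mine, not anything from this repo.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # placeholder backbone, not this repo's default
tokenizer = AutoTokenizer.from_pretrained(model_name)
backbone = AutoModel.from_pretrained(model_name)

for param in backbone.parameters():
    param.requires_grad = False  # freeze the pretrained backbone

projection = nn.Linear(backbone.config.hidden_size, 128)  # 128-dim output is a guess
optimizer = torch.optim.AdamW(projection.parameters(), lr=1e-4)  # projection params only

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients needed through the frozen backbone
        hidden = backbone(**batch).last_hidden_state  # [batch, seq, hidden]
    return projection(hidden)  # gradients flow through the projection only
```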
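And for question 2, the workflow I'm imagining: run the GPU-heavy encoding pass once, cache the token embeddings to disk, then sweep projection/k-means configurations over the cached tensors only. Again, all names, sizes, and the scikit-learn k-means stand-in are my assumptions, not this project's API.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer
from sklearn.cluster import KMeans

model_name = "bert-base-uncased"  # placeholder backbone
tokenizer = AutoTokenizer.from_pretrained(model_name)
backbone = AutoModel.from_pretrained(model_name).eval()

corpus = ["first placeholder document", "second placeholder document"]

# One-time, GPU-heavy encoding pass; save the result to disk.
with torch.no_grad():
    batch = tokenizer(corpus, padding=True, truncation=True, return_tensors="pt")
    token_embs = backbone(**batch).last_hidden_state  # [docs, tokens, hidden]
torch.save(token_embs, "corpus_token_embeddings.pt")

# Later runs: sweep projection dims + centroid counts on the cached tensors only.
token_embs = torch.load("corpus_token_embeddings.pt")
for dim in (64, 128):  # hypothetical projection widths to compare
    proj = nn.Linear(token_embs.shape[-1], dim)  # would be trained in practice
    with torch.no_grad():
        projected = proj(token_embs).reshape(-1, dim)  # flatten docs to tokens
    for k in (2, 4):  # tiny centroid counts to suit the placeholder corpus
        kmeans = KMeans(n_clusters=k, n_init="auto").fit(projected.numpy())
        # ... evaluate retrieval quality with these centroids here ...
```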

P.S. Big fan of this + ColBERT work in general!
