#1265 introduced a document classification node FARMClassifier.
We should check whether this new node can be used in an indexing pipeline (mentioned here: https://github.com/deepset-ai/haystack/pull/1265/files#r668842533). In that case, the classifier could be used to enrich the metadata of documents at indexing time, for example, with zero-shot classification models. The FARMClassifier expects a List of Documents as input and returns a List of Documents as output. I don't see a reason why it should not work.
We would need to test, for example, a Converter node followed by a FARMClassifier.
It probably makes more sense to run the classifier after the PreProcessor, but we might want to keep the flexibility to use it right after the converters or after further preprocessing (e.g. document splitting).
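To make the two placements concrete, here is a minimal sketch that does not use Haystack at all. `Doc`, `toy_converter`, `toy_splitter`, `toy_classifier` and `run_indexing` are hypothetical stand-ins invented for illustration; the only property taken from the issue is that the classifier takes a List of documents and returns a List of documents, which is what lets it sit at either position.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Document (not the Haystack class)."""
    text: str
    meta: Dict[str, str] = field(default_factory=dict)


# A node is anything that maps a list of documents to a list of documents.
Node = Callable[[List[Doc]], List[Doc]]


def toy_converter(raw_texts: List[str]) -> List[Doc]:
    """Stand-in for a Converter node: wraps raw strings into documents."""
    return [Doc(text=t) for t in raw_texts]


def toy_splitter(docs: List[Doc]) -> List[Doc]:
    """Stand-in for document splitting: one document per non-empty paragraph."""
    return [
        Doc(text=part.strip(), meta=dict(d.meta))
        for d in docs
        for part in d.text.split("\n\n")
        if part.strip()
    ]


def toy_classifier(docs: List[Doc]) -> List[Doc]:
    """Stand-in for a classifier node: adds a label to each document's meta."""
    for d in docs:
        d.meta["label"] = "sports" if "football" in d.text.lower() else "other"
    return docs


def run_indexing(raw_texts: List[str], nodes: List[Node]) -> List[Doc]:
    """Convert the raw texts, then apply the given nodes in order."""
    docs = toy_converter(raw_texts)
    for node in nodes:
        docs = node(docs)
    return docs


raw = ["Football season starts.\n\nThe weather is mild."]

# Placement 1: classifier right after the converter (one label per file).
after_converter = run_indexing(raw, [toy_classifier])

# Placement 2: classifier after splitting (one label per passage).
after_splitting = run_indexing(raw, [toy_splitter, toy_classifier])
```

In this toy setup the first placement labels the whole file once, while the second labels each passage separately, which is the main practical difference between the two orders.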
There's a small hurdle to take: preprocessing usually works on dicts, whereas TransformersDocumentClassifier currently expects Document objects as input. As there is no common conversion logic from dicts to Documents across the different DocumentStores, we might want to make TransformersDocumentClassifier also work with dicts.
To support custom content fields, it's better to keep the "Document only way", since only client code can provide this information. This means we have to convert the dicts to Documents within the preprocessing logic.
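A rough sketch of what such a conversion step could look like, written without Haystack. `Doc`, `dicts_to_docs` and the `content_field` parameter are hypothetical names chosen for illustration, and the assumption that each dict carries its metadata under a `"meta"` key is mine, not something confirmed in this thread. The point is only that the caller tells the conversion which key holds the content.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a Document (not the Haystack class)."""
    text: str
    meta: Dict[str, Any] = field(default_factory=dict)


def dicts_to_docs(dicts: List[Dict[str, Any]], content_field: str = "text") -> List[Doc]:
    """Convert preprocessing dicts to Doc objects.

    content_field is supplied by client code, because only the client
    knows which key holds the document content.
    """
    docs = []
    for d in dicts:
        if content_field not in d:
            raise KeyError(f"dict has no content field {content_field!r}")
        # Copy the meta dict so later enrichment does not mutate the input.
        docs.append(Doc(text=d[content_field], meta=dict(d.get("meta", {}))))
    return docs


# Usage with a custom content field named "body" (an illustrative name).
converted = dicts_to_docs(
    [{"body": "Some passage", "meta": {"name": "a.txt"}}],
    content_field="body",
)
```

Raising on a missing content field, rather than silently producing an empty document, keeps a misconfigured field name from going unnoticed at indexing time.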
It would be better to introduce a new tutorial instead of extending the existing Tutorial8_Preprocessing. That way, we can show in one tutorial how the classification results added dynamically at index time can be used at query time.
Finally, we would need to check whether the meta field of documents indexed in the document store contains class labels.
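As a sketch of that final check, the snippet below inspects a plain list of dicts standing in for whatever the document store returns. The helper `docs_missing_label` and the meta key `"classification"` are assumptions made for illustration; the actual key depends on how the classifier node writes its result.

```python
from typing import Any, Dict, List


def docs_missing_label(
    indexed_docs: List[Dict[str, Any]], label_key: str = "classification"
) -> List[str]:
    """Return the ids of indexed documents whose meta lacks the label key."""
    return [
        str(d.get("id", "<no id>"))
        for d in indexed_docs
        if label_key not in d.get("meta", {})
    ]


# Hypothetical documents as they might come back from a store.
indexed = [
    {"id": "1", "text": "a", "meta": {"classification": {"label": "sports"}}},
    {"id": "2", "text": "b", "meta": {"name": "b.txt"}},
]

missing = docs_missing_label(indexed)
```

A test for the indexing pipeline could then assert that this list is empty after indexing.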