-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: update dependencies and remove pins #3147
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello! Thanks for another important change like this 😊 However there are some things to fix.
-
Please revert the doc changes. It's mostly indentation, and in some places it looks really odd. I'd prefer to keep the old version.
-
Some of the removed pins are a bit dangerous, so let's dig into how you tested them. I'll do my tests separately to verify it's all good before approving.
-
I've seen you upgraded some containers, but the upgrades don't match well across the tutorials, the tests and Haystack utils. Let's make sure all versions are all in sync everywhere.
I've added some more specific questions in the comments.
@@ -62,9 +62,9 @@ dependencies = [ | |||
"mmh3", # fast hashing function (murmurhash3) | |||
"quantulum3", # quantities extraction from text | |||
"posthog", # telemetry | |||
"azure-ai-formrecognizer==3.2.0b2", # forms reader | |||
"azure-ai-formrecognizer>=3.2.0b2", # forms reader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure, have you actually tried to install a later version of this package and run tests and tutorials with that? I have a vague memory that we pinned this beta release for a good reason, even though I can't recall why right now. @bogdankostic do you know more about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZanSara When the code has been last updated, the azure-ai-formrecognizer latest version available was b2. The other versions would not have the DocumentAnalysisClient available.
from azure.ai.formrecognizer import DocumentAnalysisClient, AnalyzeResult
pip would not force a beta over a GA. So, without the pin, 3.1.2 was being chosen. The new versions all have DocumentAnalysisClient available and some improvements (support for new API version). Despise MS released latest as GA, pip has yet the b6 tag there.
# audio's espnet-model-zoo requires huggingface-hub version <0.8 while we need >=0.5 to be able to use create_repo in FARMReader | ||
"huggingface-hub<0.8.0,>=0.5.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you test this, specifically? Did the audio nodes install and work properly once you installed without this pin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Audio tests are had been run manually. Furthermore, going to the espnet-model-zoo, you will see that there is no need for lower hub versions anymore. Isn't there on CI any audio tests also? I don't remember.
Anyway, I have run then locally.
This comes along with the latest versions of pydoc-markdown, it's something we need to live with eventually. |
@ZanSara The documentation update is the new pydoc-markdown standard, it wasn't done by me (directly). It's intention, I suppose, is to make documentation easier to read as we will get one parameter per line. User can see easily know how the methods are called. |
@masci @danielbichuetti thank you for the clarification on the docs format change. I didn't know about it. Let's keep it then! 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary for future reference:
- Tika: upgraded to 1.28.4
- Opensearch: upgraded to 1.3.5
- Weaviate: upgrade leftover from 1.11 to 1.14
Looks good to me! 👍 As soon as the CI completes this can be merged.
@danielbichuetti can I ask why Tika wasn't changed to 2.x/latest? If there's an incompatibility, perhaps a note can be left somewhere in the code or documentation to prevent changing it? Or maybe an issue opened to invite/track work to make Haystack compatible with it? Surely there's a massive difference between 1.28 and 2.x |
For further notes regarding Tika 2.x update:
Here is the issue that tracks the update on the module repo: |
Related Issues
Proposed Changes:
After a recent issue opened on the repository I decided to take a look at the dependencies that are pinned. After discovering the reasons why each one has been pinned, I started looking over the dependency chain and dependency releases. Indeed, some contain bug fixes, better memory management, performance modifications, and some refactored code.
Containers versions used in samples and tests have been looked over for newer versions, even if this doesn't affect haystack usage, some users may erroneously think that the outdated versions are a need. Despise been not.
So, I updated all possible versions that will fit in the whole project dependency chain.
How did you test it?
Repository tests have been run here: https://github.com/danielbichuetti/haystack/actions/runs/2988323504
Notes for the reviewer
Checklist