Allow list of files in Dependencies methods #370

Merged
merged 4 commits into from
Feb 13, 2024


@hagenw hagenw commented Feb 12, 2024

This adds support for passing a list of files to all file-based methods of audb.Dependencies, for example audb.Dependencies.archive(files).

To preserve backwards compatibility, I decided against always returning a list as the result.
Instead, you always get a list back when you pass a list of files as the request.
If you pass an empty list, you also get an empty list back.
If you request a non-existing file, it fails with the same error in both cases.
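The return-type rule above can be illustrated with a minimal, stdlib-only sketch. `ToyDependencies` is a hypothetical stand-in backed by a plain dict, not audb's implementation, and the `KeyError` it raises for a missing file is only a property of this toy:

```python
from typing import Dict, List, Sequence, Union


class ToyDependencies:
    """Hypothetical stand-in for a dependency table, not audb's code."""

    def __init__(self, archives: Dict[str, str]):
        # Maps file path -> archive name (made-up data for illustration).
        self._archives = archives

    def archive(
        self, files: Union[str, Sequence[str]]
    ) -> Union[str, List[str]]:
        # A single file keeps the old behavior and returns a single value.
        if isinstance(files, str):
            return self._archives[files]
        # A list of files returns a list, so an empty list yields [].
        return [self._archives[file] for file in files]


deps = ToyDependencies({"a.wav": "archive-1", "b.wav": "archive-2"})
single = deps.archive("a.wav")             # 'archive-1'
several = deps.archive(["a.wav", "b.wav"])  # ['archive-1', 'archive-2']
empty = deps.archive([])                    # []
```

In this sketch a missing file raises the same exception type whether it is requested alone or as part of a list, which mirrors the behavior described above.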

This makes it possible to access a large number of files much faster than with [deps.archive(file) for file in files].
At the moment, the benchmarks show a speed-up only when strings are stored as object dtype. I have already checked that the other two benchmark columns reach the same speed when object is always used as the dtype of the index.
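As a rough illustration of the two access patterns being compared, the sketch below assumes the dependency table is a pandas DataFrame indexed by file path with an `archive` column. The table contents are made up, and the snippet only checks that both patterns return the same values; it does not reproduce the benchmark:

```python
import pandas as pd

# Hypothetical dependency table: file paths as an object-dtype index.
files = [f"audio/{n:03d}.wav" for n in range(1000)]
df = pd.DataFrame(
    {"archive": [f"archive-{n // 10}" for n in range(1000)]},
    index=pd.Index(files, dtype="object"),
)

requested = files[::2]

# Per-file access: one scalar lookup for every requested file.
per_file = [df.at[file, "archive"] for file in requested]

# Batched access: one label-based selection for all requested files.
batched = df.loc[requested, "archive"].tolist()
```

How much faster the batched form is depends on the dtype of the index, which is the point the benchmark remark above makes.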

image

image

Example change to docstring:

image

The code inside audb has not yet been changed to take advantage of the new files argument, as I will first tackle the speed issue we see when strings are stored as string dtype.

@hagenw hagenw changed the base branch from main to dev February 12, 2024 16:09

codecov bot commented Feb 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (52c631b) 100.0% compared to head (5521086) 100.0%.

Additional details and impacted files
Files Coverage Δ
audb/core/dependencies.py 100.0% <100.0%> (ø)

@hagenw hagenw marked this pull request as ready for review February 13, 2024 09:59
@hagenw hagenw merged commit dd7642e into dev Feb 13, 2024
9 checks passed
@hagenw hagenw deleted the dependencies-files branch February 13, 2024 10:06
hagenw added a commit that referenced this pull request Feb 23, 2024
* Allow list of files in Dependencies methods

* Add first tests

* Extend tests

* Update docstrings
hagenw added 8 more commits that referenced this pull request on May 3, 2024, and 2 more on May 8, 2024
@hagenw hagenw restored the dependencies-files branch May 29, 2024 13:01