-
Notifications
You must be signed in to change notification settings - Fork 643
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support file-like object in info #1108
Conversation
6937235
to
37a2654
Compare
5809424
to
22398a3
Compare
d15a004
to
55ae6f4
Compare
In case you don't like file-like object, file-object is actually defined to be a synonym in the python glosassry. Same with stream. https://docs.python.org/3/glossary.html#term-file-object |
I'm a fan. Using pybind11 in eager mode and torchbind for deployment only really is opening a lot of doors. What's the error behavior when someone is using a pybind11 feature that torchbind does not support? For now we've been using this dual registration approach for performance reasons since pybind11 has lower eagermode overhead. |
|
Ah, so pybind is picked for performance reason then? I know we observe performance difference in some context, but did we measure a significant difference here? There's a note in the description about being ok with low performance in the description for part of the PR. |
It's been picked for performance reasons in the context of torchtext where the dual pybind11/torchbind approach originated. Using it for the purposes of getting eagermode-only features is new. |
well, actually, I prefer the notion of file-
So the behavior is,
|
yes
It is tested. Checkout the tests added. |
I meant: there are tests for the load functions within both sox_io and soundfile backends, but not one calling directly the |
Check out how the backend tests are written. |
Makes sense. We could either as a follow-up or using this as a vehicle look into how we can improve the error message, if necessary, when someone tries to convert a model into TorchScript that calls these functions using bytes. |
6e369c3
to
b2793c7
Compare
Well, once the function is scripted, we cannot do much about it, because TorchScript runtime won't run the function if the input type is invalid. However it seems that the error message is clear if the reason for the failure is wrong type. foo.pyimport torch
import torchaudio
path = 'test/torchaudio_unittest/assets/sinewave.wav'
# eager execution - works
with open(path, 'rb') as file_:
torchaudio.load(file_)
load = torch.jit.script(torchaudio.load)
# jit execution with str - works
load(path)
# jit execution with file-like obj - does not work.
with open(path, 'rb') as file_:
load(file_)
|
@cpuhrsch Update the docstring to mention the difference between scripted/eager mode. |
28c98d9
to
22aadef
Compare
Hi @mthrok, I wonder if the parameters |
Hi @igorgad Thanks for the question, the answer is (unfortunately) no. The goal of this PR is to extend the support to Python's file object, which is the protocol Then the question is, can we (relatively easily) extend the support for such efficient network data transfer? I think the answer is no, because of the limitation Maybe If you want that feature, maybe you can file an issue with feature request and provide your workflow? (total length of the audio file and the protocol you are using to fetch the data) Though I cannot guarantee that we can full fill your request, but it will certainly help us understand your demand. |
Yes. Indeed I think that plugging a network utility into libsox isn't the best option. I investigated the use of FFmpeg to load files from GCS buckets, but it doesn't work unless you set a public URL. I'm currently investigating options to scale the training of our models on a managed cloud infrastructure, and one of the options is the Ai-Platform running custom containers. The problem is that we cannot mount NFS shares with our data on the containers due to the lack of the privileged run mode. The dataset is also big to copy from a remote path at the beginning of each experiment. This way, and inspired by this implementation, would you consider a Thanks |
Your question made me realized that I did not test on those frame slicing. While I was adding test, I looked into the read behavior, and turned out that Regarding Giving some thoughts on this, I think the process can be broken down into the following steps;
Then here are the questions that came up to me;
As you mention, if one knows that the format is WAV, then solving 2 will be easy, but solving it for general case will be difficult. Also with cloud provider, one has to think of a way to pass access tokens to the object that is making the network access. So I think, for now, to solve your problem, having your own custom implementation is the easiest solution. |
c3e51cf
to
19dd571
Compare
6367777
to
1abe038
Compare
c40edf3
to
3e1b33c
Compare
956c84b
to
f57d1a3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I left only a non-blocking comment.
// See: | ||
// https://xiph.org/vorbis/doc/Vorbis_I_spec.html | ||
auto capacity = 4096; | ||
std::string buffer(capacity, '\0'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as other PR. Perhaps there is a better type buffer than std:string here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds support for file-like object to
info
function ofsox_io
backend andsoundfile
backends.See also #1115