Roadmap ahead for torchaudio #1196

Closed
vincentqb opened this issue Jan 25, 2021 · 0 comments

Many exciting work items are planned for torchaudio.

  • Provide support for large scale training.
    • Support a large-scale training reference task using wav2vec on librivox, and offer a pre-trained version of the model.
    • Support the emergence of audio-specific transformer models by exploring which abstractions would be beneficial to provide.
  • Extend support for speech recognition.
    • Investigate the addition of beam search, and a 4-gram language model, see here and here, to reduce the word error rate in the existing pipeline.
    • ✅ Support in-memory codec encoding and decoding, see here, to support codec based data augmentation.
    • ✅ Add the Kaldi pitch feature, see here, which is used in the audio community.
    • Implement a prototype of a WFST-based ASR model, using GTN or K2, see here.
    • Add RNN transducer loss, see here and follow-up, to train RNN transducer models efficiently.
  • Provide a high-performance data loading and media decoding experience.
    • Provide a fast audio I/O module, see here.
    • Provide audio streaming abstractions with examples, see here.
  • Improve our codebase.
    • ✅ Create libtorchaudio by building the C++ extension outside of Python, see here.
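
To make the beam search item above more concrete, here is a minimal sketch of beam search decoding with language-model rescoring. It is not torchaudio's API or the planned implementation: `beam_search`, `toy_lm`, `lm_weight`, and the toy probabilities are all hypothetical, and the sketch emits one token per step rather than handling CTC blanks or real 4-gram lookups.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# A hypothesis is (tokens so far, accumulated log score).
Hypothesis = Tuple[Tuple[str, ...], float]


def beam_search(
    step_log_probs: Sequence[Dict[str, float]],
    lm_log_prob: Callable[[Tuple[str, ...], str], float],
    beam_size: int = 2,
    lm_weight: float = 0.5,
) -> List[Hypothesis]:
    """Return up to `beam_size` hypotheses, best first (illustrative only)."""
    beams: List[Hypothesis] = [((), 0.0)]
    for log_probs in step_log_probs:
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for token, acoustic in log_probs.items():
                # Combine the acoustic score with the weighted LM score.
                total = score + acoustic + lm_weight * lm_log_prob(tokens, token)
                candidates.append((tokens + (token,), total))
        # Keep only the highest-scoring hypotheses for the next step.
        candidates.sort(key=lambda h: h[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


def toy_lm(history: Tuple[str, ...], token: str) -> float:
    # Hypothetical stand-in for a 4-gram model: it only inspects the previous token.
    if history and history[-1] == "the" and token == "cat":
        return math.log(0.9)
    return math.log(0.1)


# Toy per-step acoustic log-probabilities (made-up numbers).
steps = [
    {"the": math.log(0.6), "a": math.log(0.4)},
    {"cat": math.log(0.4), "cap": math.log(0.6)},
]

without_lm = beam_search(steps, toy_lm, beam_size=2, lm_weight=0.0)
with_lm = beam_search(steps, toy_lm, beam_size=2, lm_weight=1.0)
print(without_lm[0][0])  # ('the', 'cap')
print(with_lm[0][0])     # ('the', 'cat')
```

In this toy setup the acoustic scores alone favor "the cap", while adding the language-model score moves "the cat" to the top, which is the kind of word-error-rate improvement the roadmap item aims for.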

The goal of torchaudio is to accelerate research through novel, production-ready building blocks. As such, we would love to hear feedback on the plan, so make sure to reach out to us, @mthrok and @vincentqb!

cc internal

@mthrok pinned this issue Jan 25, 2021
@mthrok closed this as completed Jul 22, 2021
@mthrok unpinned this issue Jul 22, 2021