RFC: Streaming Inference / Application #1072

Closed
mthrok opened this issue Dec 8, 2020 · 3 comments

mthrok commented Dec 8, 2020

The torchaudio team is looking for ways to support streaming applications. We are trying to define the problem space and scope the challenge we want to tackle. To that end, we would like to hear your thoughts on, and experience with, streaming applications. If you have any, please let us know by leaving a comment.

Questions include, but are not limited to:

  • How are you feeding your input stream to your system?
    • WebSocket? FFMpeg + STDIN? gRPC?
  • What technology stack do you use?
    • Kaldi? ONNX? TorchScript? TensorRT? DeepStream?
  • What are the pain points in your application lifecycle?
    • Development?
    • Deployment?
    • Maintenance?
  • What type of application do you run?
    • Speech recognition?
    • Audio enhancement?
    • Noise reduction?
    • Audio event detection?
  • What kind of device is your application running on?
    • Web server?
    • Desktop system?
    • Mobile device?
    • Embedded system?
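
To make the "FFmpeg + STDIN" option above concrete, here is a minimal sketch of reading a raw audio stream in fixed-size chunks. It is not a torchaudio API: the function name `iter_pcm_chunks`, the mono 16-bit little-endian PCM format, and the 1600-sample chunk size (100 ms at 16 kHz) are all assumptions made for illustration.

```python
import io
import struct
from typing import BinaryIO, Iterator, List


def iter_pcm_chunks(stream: BinaryIO, chunk_samples: int = 1600) -> Iterator[List[float]]:
    """Yield chunks of mono 16-bit little-endian PCM as floats in [-1.0, 1.0).

    Hypothetical helper for illustration only. It assumes a buffered binary
    stream (such as sys.stdin.buffer), where a short read happens only at
    end of stream.
    """
    bytes_per_sample = 2
    chunk_bytes = chunk_samples * bytes_per_sample
    while True:
        data = stream.read(chunk_bytes)
        if not data:
            break  # end of stream
        # Ignore a trailing odd byte that cannot form a full sample.
        usable = len(data) - (len(data) % bytes_per_sample)
        if usable == 0:
            break
        samples = struct.unpack("<%dh" % (usable // bytes_per_sample), data[:usable])
        # Scale int16 values to floats.
        yield [s / 32768.0 for s in samples]


if __name__ == "__main__":
    # Example with an in-memory stream; in a real pipeline the stream would
    # be sys.stdin.buffer, fed for example by:
    #   ffmpeg -i input.wav -f s16le -ac 1 -ar 16000 - | python this_script.py
    fake = io.BytesIO(struct.pack("<4h", 0, 16384, -32768, 32767))
    for chunk in iter_pcm_chunks(fake, chunk_samples=3):
        print(len(chunk), chunk[0])
```

Each yielded chunk could then be converted to a tensor and passed to a model. How state is carried across chunks is model-specific and is not covered by this sketch.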
@mthrok mthrok changed the title [RFC] Streaming Application [RFC] Streaming Inference / Application Dec 8, 2020
@mthrok mthrok added the RFC label Dec 9, 2020
@tongjinle123

1. gRPC
2. ONNX, TorchScript
3. Deployment
4. Speech recognition
5. ~

@mthrok mthrok changed the title [RFC] Streaming Inference / Application RFC: Streaming Inference / Application Jan 1, 2021

mthrok commented Jan 5, 2021

@tongjinle123 Thanks for the comment. Is your production environment Python or C++?

@vincentqb vincentqb mentioned this issue Jan 8, 2021
mthrok pushed a commit to mthrok/audio that referenced this issue Feb 26, 2021

hbredin commented Apr 12, 2021

As promised in #1442, here are my potential needs for pyannote.audio.
I don't have much experience in this area, so bear with me if my answers fall flat.

* How are you feeding your input stream to your system?

I'd love to be able to extend this streamlit demo to use the end user's actual microphone, probably with something like streamlit-webrtc.

* What technology stack do you use?
  * Kaldi? ONNX? TorchScript? TensorRT? DeepStream?

None of these for now, just regular PyTorch, since pyannote.audio is designed for research purposes rather than actual production.

* What type of application do you run?
  • speaker diarization
  • audio event detection
* What kind of device is your application running on?
  • Web server

@mthrok mthrok closed this as completed Oct 19, 2023