Audio Transcriber is a tool that allows users to upload audio files via a custom API. It converts audio to WAV format, performs text transcription, and generates summaries using language models like ChatGPT. It supports multiple languages and provides the transcribed text and summaries via the API. Check the project board for planned features.

Niklashere/Transcriber

Audio Transcriber

GitHub licence Contributor Covenant

📜 Description

Audio Transcriber is a tool that lets users upload audio files through a custom API platform. The backend converts each uploaded file to WAV format and then transcribes it to text. Users can choose between different models for more precise transcriptions, and the transcript is passed to a language model (e.g., ChatGPT) to generate a summary. Both the transcript and the summary are returned via the API.
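
The flow described above can be sketched as three stages chained together. This is only an illustration, not the project's actual code: `process_upload`, `TranscriptionResult`, and the `fake_*` stand-ins are hypothetical names, and the real backend may structure these steps differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranscriptionResult:
    """Hypothetical result shape: the full transcript and its summary."""
    transcript: str
    summary: str


def process_upload(
    audio_path: str,
    to_wav: Callable[[str], str],
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> TranscriptionResult:
    """Run the three stages in order: convert, transcribe, summarize."""
    wav_path = to_wav(audio_path)      # e.g. "talk.mp3" -> "talk.wav"
    transcript = transcribe(wav_path)  # speech-to-text model, e.g. Whisper
    summary = summarize(transcript)    # language model, e.g. ChatGPT
    return TranscriptionResult(transcript=transcript, summary=summary)


# Stand-in stages so the sketch runs without any model installed.
def fake_to_wav(path: str) -> str:
    return path.rsplit(".", 1)[0] + ".wav"


def fake_transcribe(wav_path: str) -> str:
    return f"transcript of {wav_path}"


def fake_summarize(text: str) -> str:
    return text[:10]  # a real summarizer would call a language model


result = process_upload("talk.mp3", fake_to_wav, fake_transcribe, fake_summarize)
print(result)
```

Passing the stages in as functions keeps the sketch independent of any particular transcription or summarization model, which matches the idea that the model is selectable.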

💫 Features

  • Development of a custom API platform
  • Users can upload audio files via the API
  • The backend receives the audio files and converts them to WAV format
  • The audio is then transcribed to text
  • Integration of speech recognition and/or language selection for different languages
  • Option to select different transcription models (e.g., Whisper) for precise transcriptions
  • The transcribed text is sent to a language model (e.g., ChatGPT) to generate an accurate summary
  • The transcribed text and summary are provided via the API
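
As an illustration of the WAV conversion step in the list above, here is a minimal sketch that shells out to ffmpeg. Using ffmpeg is an assumption, since the README does not say which converter the backend uses, and `build_wav_command` and `convert_to_wav` are hypothetical helper names. The 16 kHz mono target is chosen because that is the audio format Whisper operates on.

```python
import subprocess
from pathlib import Path
from typing import List


def build_wav_command(source: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that converts `source` to a mono WAV file."""
    target = str(Path(source).with_suffix(".wav"))
    return [
        "ffmpeg",
        "-y",                     # overwrite the target if it already exists
        "-i", source,             # input file in any format ffmpeg can read
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate
        target,
    ]


def convert_to_wav(source: str) -> str:
    """Run the conversion (requires ffmpeg on PATH) and return the WAV path."""
    command = build_wav_command(source)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return command[-1]


print(build_wav_command("talk.mp3"))
```

Building the argument list separately from running it makes the command easy to inspect or log before any file is touched.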

📝 TODO

Planned features and enhancements are tracked on the project's board. Check the board for updates and future developments.

📩 Installation

  1. Ensure Python 3.8 or later and Node.js are installed.
  2. Clone the repository.
  3. In a console, navigate to the root folder of the repository.
  4. OPTIONAL: Create a virtual environment with python -m venv .venv if needed. Otherwise, proceed to step 6.
  5. OPTIONAL: Activate the virtual environment.
  6. Install all required Python packages by running pip install -r requirements.txt.
  7. Install PyTorch by following the official PyTorch installation instructions.
  8. OPTIONAL: When using a supported OS, install Accelerate to benefit from performance gains. See the Accelerate documentation for instructions.
  9. OPTIONAL: Install DeepSpeed to benefit from additional performance gains. See the DeepSpeed documentation for instructions.
  10. Rename config.example.py to config.py and configure to your liking.
  11. Start the backend by executing main.py in the backend folder.
  12. Start the frontend by running ng serve inside the frontend folder.
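
Before starting the backend and frontend, it can help to confirm that the prerequisites from the steps above are in place. The script below is a small sketch, not part of the repository: `check_prerequisites` is a hypothetical helper, and it assumes the Node.js and Angular CLI executables are named `node` and `ng` on your PATH.

```python
import importlib.util
import shutil
import sys
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report which prerequisites from the installation steps are present."""
    return {
        "python>=3.8": sys.version_info >= (3, 8),
        "node": shutil.which("node") is not None,  # assumed executable name
        "ng": shutil.which("ng") is not None,      # Angular CLI, for ng serve
        "torch": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok     ' if ok else 'missing'}  {name}")
```

Run it from inside the activated virtual environment, if you created one, so that the check for PyTorch looks at the same packages the backend will use.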

💾 Contributing

For more information on how to contribute, please visit the Contributing page.
