vad-transcriber

The Python scripts in this repo perform transcription on long wav files. They take in a wav file of any duration, use the WebRTC Voice Activity Detector (VAD) to split it into smaller chunks, transcribe each chunk, and finally save a consolidated transcript.
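
To illustrate the splitting step, here is a minimal sketch of how voiced frames might be grouped into chunks. It is not necessarily how the scripts in this repo do it: the names `frame_generator`, `split_on_silence`, and `max_silence_frames` are hypothetical, and the speech detector is passed in as a plain callable so that the example runs without any third-party package. With the `webrtcvad` package, that callable could be `lambda f: vad.is_speech(f, sample_rate)` for a `webrtcvad.Vad(mode)` instance, which expects 10, 20, or 30 ms frames of 16-bit mono PCM.

```python
from typing import Callable, Iterable, List


def frame_generator(pcm: bytes, sample_rate: int, frame_ms: int = 30) -> Iterable[bytes]:
    """Yield fixed-length frames of 16-bit mono PCM; a trailing partial frame is dropped."""
    frame_bytes = int(sample_rate * frame_ms / 1000) * 2  # 2 bytes per sample
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]


def split_on_silence(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],
    max_silence_frames: int = 10,
) -> List[bytes]:
    """Group frames into chunks, closing a chunk after a long enough run of non-speech.

    `max_silence_frames` is an assumed tuning knob, not a parameter of the original tool.
    """
    chunks: List[bytes] = []
    current: List[bytes] = []
    silence_run = 0
    for frame in frames:
        if is_speech(frame):
            current.append(frame)
            silence_run = 0
        elif current:
            # Keep short pauses inside the chunk; they are trimmed if the chunk ends here.
            current.append(frame)
            silence_run += 1
            if silence_run >= max_silence_frames:
                chunks.append(b"".join(current[:-silence_run]))
                current, silence_run = [], 0
    if current:
        if silence_run:
            current = current[:-silence_run]  # drop trailing non-speech frames
        chunks.append(b"".join(current))
    return chunks
```

Each returned chunk can then be sent to the ASR engine separately, and the per-chunk results joined into one transcript.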

1. Command line tool

The command line tool processes a wav file of any duration and returns a transcript, which is saved in the same directory as the input audio file.
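
As a rough illustration of saving the transcript next to the input file, the sketch below derives the output path from the audio path. The `.txt` extension and the helper names `transcript_path_for` and `save_transcript` are assumptions made for this example; the actual output file name used by the tool may differ.

```python
from pathlib import Path
from typing import Iterable


def transcript_path_for(audio_path: str) -> Path:
    """Return a path in the same directory as the audio file, with an assumed .txt suffix."""
    return Path(audio_path).with_suffix(".txt")


def save_transcript(audio_path: str, segments: Iterable[str]) -> Path:
    """Write one transcribed segment per line and return the path that was written."""
    out_path = transcript_path_for(audio_path)
    out_path.write_text("\n".join(segments) + "\n", encoding="utf-8")
    return out_path
```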

The command line tool gives you control over the aggressiveness of the VAD. Set the aggressiveness mode to an integer between 0 and 3: 0 is the least aggressive about filtering out non-speech, and 3 is the most aggressive.

You may also specify the target language that the ASR engine will use. The default language is American English.

$ python audioTranscript_cmd.py --aggressive 1 --audio path/to/audio.wav --lang zh-CN
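
For readers who want to see how such options could be validated, here is a minimal `argparse` sketch that accepts the three flags shown in the command above. It is not taken from `audioTranscript_cmd.py`: the function name `build_parser`, the default aggressiveness of 1, and the `en-US` code used for American English are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the --aggressive, --audio and --lang flags."""
    parser = argparse.ArgumentParser(
        description="Transcribe a long wav file using VAD-based splitting."
    )
    parser.add_argument(
        "--aggressive",
        type=int,
        choices=range(4),  # only 0, 1, 2 and 3 are valid VAD modes
        default=1,         # assumed default, not confirmed by the source
        help="VAD aggressiveness: 0 filters least non-speech, 3 filters most.",
    )
    parser.add_argument("--audio", required=True, help="Path to the input wav file.")
    parser.add_argument(
        "--lang",
        default="en-US",   # assumed code for the American English default
        help="Target language for the ASR engine, for example zh-CN.",
    )
    return parser


# Parse the same arguments as the example command.
args = build_parser().parse_args(
    ["--aggressive", "1", "--audio", "path/to/audio.wav", "--lang", "zh-CN"]
)
```

Passing a value outside 0 to 3 to `--aggressive` makes `argparse` print an error and exit, so an invalid mode never reaches the VAD.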

bing-jian/vad-transcriber

Folders and files

Latest commit

History

Repository files navigation

vad-transcriber

1. Command line tool

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages