Linux voices sound robotic unlike voices on m1 #280

Open
UmerTariq1 opened this issue Jun 23, 2023 · 0 comments


Background:

I faced the exact same issue as #279 and found that it's a problem with pyttsx3's integration with M1.

So I switched to Linux, and there it seemed to work.

But after trying further, I realized that even on Linux the runAndWait() function gets stuck after the first run: the first call to runAndWait() runs and finishes, but a second call gets stuck in an infinite loop.
I found a workaround for this using threads. The code is shared below.
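
For illustration, here is a minimal sketch of the general idea of guarding a call that may never return by running it in a worker thread with a timeout. The helper name `run_with_timeout` and its parameters are hypothetical and not part of pyttsx3, and whether `runAndWait()` behaves correctly when called off the main thread is an assumption that would need checking.

```python
import threading


def run_with_timeout(blocking_call, timeout_s=10.0):
    """Run blocking_call in a daemon thread.

    Returns True if the call finished within timeout_s seconds,
    False if it was still running when the timeout expired.
    (Hypothetical helper; not part of pyttsx3.)
    """
    worker = threading.Thread(target=blocking_call, daemon=True)
    worker.start()
    worker.join(timeout_s)  # Wait at most timeout_s seconds for the call
    return not worker.is_alive()
```

With an engine in hand, this could be tried as `run_with_timeout(engine.runAndWait, timeout_s=15)`. A stuck call is not cancelled; the daemon thread is simply abandoned so the caller can carry on.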

Problem:

The problem is that the voices on Linux are very different from the ones on macOS. I cannot just use macOS because, as mentioned above, it has other issues (#279). I understand that the library uses different drivers for different OSs and that on Linux it uses espeak. But the voices on Linux are far too robotic; they are simply unusable. This link suggested using espeak voice id 11, but that doesn't work either.
Below I share my code (with threading to avoid the runAndWait() looping problem).
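
As a side note on the hard-coded voice index: the position of a given voice in the `voices` list may differ between espeak installations, so index 11 might point at a different voice or not exist at all. A minimal sketch of selecting a voice by keyword instead is shown here. The helper `pick_voice` is hypothetical, and it only assumes that each voice object has the `id` and `name` attributes. It does not make espeak sound less robotic; it only makes the selection more predictable.

```python
def pick_voice(voices, keyword, fallback_index=0):
    """Return the first voice whose id or name contains keyword.

    The match is case-insensitive. If nothing matches, the voice at
    fallback_index is returned, or None when the list is empty.
    (Hypothetical helper; not part of pyttsx3.)
    """
    needle = keyword.lower()
    for voice in voices:
        # Some voices may lack a name, so fall back to an empty string.
        haystack = f"{voice.id} {getattr(voice, 'name', '') or ''}".lower()
        if needle in haystack:
            return voice
    return voices[fallback_index] if voices else None
```

Usage would look like `voice = pick_voice(engine.getProperty('voices'), 'english')` followed by `engine.setProperty('voice', voice.id)`.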

What I want:

A less robotic voice. Note that the voice I am getting now is English; it is just robotic.

Code:

github link

(The formatting of the code below may be a bit off; please refer to the GitHub link for the properly formatted version.)

```python
import threading

import pyttsx3
from pydub import AudioSegment


class AudioGenerator:
    def __init__(self):
        # Initialize the pyttsx3 engine with the "espeak" driver.
        self.engine = pyttsx3.init("espeak")

        self.lock = threading.Lock()  # Held while a synthesis is in progress
        self.finished = False  # Set to True once speech synthesis has finished

        # Set the event handler for the end of an utterance.
        self.engine.connect('finished-utterance', self._on_end)

        # Slow the speech down slightly and select the voice at index 11.
        self.engine.setProperty('rate', self.engine.getProperty('rate') - 20)
        self.engine.setProperty('voice', self.engine.getProperty('voices')[11].id)

    def _on_end(self, name, completed):
        self.finished = True  # Speech synthesis is complete
        self.lock.release()  # Release the lock so the main thread can proceed

    def generate_audio_file(self, text, filename):
        self.lock.acquire()  # Acquire the lock so the main thread has to wait
        self.finished = False  # Reset the finished flag for the new synthesis
        # Saving to a file is just what I do; it is not necessary.
        self.engine.save_to_file(text, filename)  # Save the speech to the specified file
        self.engine.startLoop(False)  # Start the engine loop
        self.engine.iterate()  # Run one engine loop iteration
        self.engine.endLoop()  # End the engine loop

    def get_audio(self, filename):
        # Returns an audio object that can be played directly.
        return AudioSegment.from_file(filename, format="mp3")
```

```python
text = "Hi. How are you? what are you doing?"
file_path = "test.mp3"  # Saving to a file is just what I do; it is not necessary.

audio_generator = AudioGenerator()
audio_generator.generate_audio_file(text, file_path)
audio_generator.lock.acquire()  # Wait for the synthesis to complete
audio_generator.lock.release()  # Release the lock
# Because the speech is saved to a file, it can be loaded here and then played
# (for example, sent to a frontend).
audio = audio_generator.get_audio(file_path)
```
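
One caveat with waiting on the lock as above: if the `finished-utterance` callback never fires, the second `acquire()` blocks forever. A possible alternative, sketched here under that assumption, is a `threading.Event` that the callback sets and the caller waits on with a timeout. The class name `CompletionSignal` and its methods are hypothetical and not part of pyttsx3.

```python
import threading


class CompletionSignal:
    """One-shot 'done' flag that a callback sets and a caller waits on.

    (Hypothetical helper; not part of pyttsx3.)
    """

    def __init__(self):
        self._event = threading.Event()

    def reset(self):
        # Clear the flag before starting a new synthesis.
        self._event.clear()

    def on_end(self, name=None, completed=True):
        # Same parameters as the _on_end(name, completed) handler in the issue.
        self._event.set()

    def wait(self, timeout_s=None):
        # Returns True if on_end was called, False if the timeout expired first.
        return self._event.wait(timeout_s)
```

The handler could then be registered with `engine.connect('finished-utterance', signal.on_end)`, and the caller would check the return value of `signal.wait(timeout_s=30)` rather than blocking indefinitely.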
