Convert Text into Audio Using Python

A text-to-speech (TTS) converter is a technology that turns written text into a computer-generated human voice. Like speech recognition, it often makes use of artificial intelligence (AI). In this article, I will briefly guide you through two ways to convert text to speech using the Python programming language.

Why Python?

Python is one of the most popular programming languages today. It is easy for beginner programmers to read and debug, and thanks to its large community it has a wide range of modules, which adds to its popularity among programmers.

Getting Started

In this article, we will learn two ways to convert text to speech in Python. Both libraries have their own pros and cons, and we will briefly look at each of them.

pyttsx3

The pyttsx3 module converts text to speech and, unlike many other libraries, works offline. It supports multiple speech engines, including the Microsoft Windows speech engine. To install this module, run the following command in your command prompt or terminal.

pip install pyttsx3

The code below shows an example.

https://python.plainenglish.io/media/873f642212b242a5813c23e09f104379

Let's break down the code.

First, we import the pyttsx3 module. Next, we set up the engine by calling the init() function and storing the result in the engine variable. After this, I give the engine some text to convert by calling the say() method with a string as its parameter. In the end, I call the runAndWait() method, which generates the speech and plays it.
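The steps just described can be sketched as follows. This is a minimal sketch, not necessarily the exact code embedded above: the function name speak and its optional engine parameter are hypothetical helpers of mine, and pyttsx3 is only imported when no engine is passed in.

```python
def speak(text, engine=None):
    """Read `text` aloud and block until playback has finished.

    `speak` and its `engine` parameter are hypothetical helper names,
    not part of pyttsx3 itself.
    """
    if engine is None:
        # Imported lazily so the function can be defined without the module.
        import pyttsx3  # third-party: pip install pyttsx3

        engine = pyttsx3.init()  # uses the default driver for this platform

    engine.say(text)      # queue the text to be spoken
    engine.runAndWait()   # process the queue; the audio is played here
    return engine
```

Calling speak("Hello, this is Python speaking") on a machine with pyttsx3 installed and a working audio device should read the sentence aloud. Returning the engine lets you reuse it for further calls.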

The next sections focus on the features of pyttsx3.

Voice Speaking Rate

The voice that pyttsx3 produces is computer-generated and can sometimes sound robotic, partly because humans do not always speak at the same slow pace. We can change the speaking speed by setting the rate property.

https://python.plainenglish.io/media/27945bc791e4b97ea841f7464429a436

The rate-setting code starts on line 5, where we check the current speaking rate; then on line 7 we use the setProperty() method, passing the property name and its value. Once this is in place, add the text-conversion lines from the first example and you will hear the difference in speaking speed.
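Reading the rate and then setting a new one might look like the sketch below. The helper name change_rate is hypothetical; getProperty() and setProperty() are the pyttsx3 engine methods, and the pyttsx3 documentation describes the rate as an integer in words per minute.

```python
def change_rate(engine, delta):
    """Shift the speaking rate by `delta` words per minute.

    `change_rate` is a hypothetical helper; `engine` is expected to be
    the object returned by pyttsx3.init().
    """
    current = engine.getProperty("rate")   # read the current rate
    new_rate = current + delta             # negative delta slows the voice down
    engine.setProperty("rate", new_rate)   # apply the new rate
    return new_rate


# Usage (needs pyttsx3 and an audio device), shown as comments:
#   import pyttsx3
#   engine = pyttsx3.init()
#   change_rate(engine, -50)
#   engine.say("This sentence is spoken more slowly.")
#   engine.runAndWait()
```

Working with a relative change instead of a fixed number keeps the example independent of whatever default rate your platform reports.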

Increase Volume

We also have the option to change the volume of the computer-generated voice. We use the setProperty() method again, and this time we pass the property name volume and its value. The following code example makes this clear.

https://python.plainenglish.io/media/856296c2127e108909b310e455cfa95b

Volume values range from 0 to 1, which means you cannot go above 1; you can set intermediate values such as 0.5 or 0.4. So the minimum volume is 0 and the maximum volume is 1.
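Since the valid range is 0 to 1, a small guard that clamps the value before handing it to the engine can be handy. This is only a sketch: set_volume is a hypothetical helper name, and the clamping is my own addition, not something pyttsx3 requires you to do.

```python
def set_volume(engine, level):
    """Set the engine volume, clamping `level` into the 0.0 to 1.0 range.

    `set_volume` is a hypothetical helper; `engine` is expected to be
    the object returned by pyttsx3.init().
    """
    level = min(1.0, max(0.0, float(level)))  # keep the value between 0 and 1
    engine.setProperty("volume", level)       # apply it to the engine
    return level


# Usage (needs pyttsx3 and an audio device), shown as comments:
#   import pyttsx3
#   engine = pyttsx3.init()
#   set_volume(engine, 0.5)
#   engine.say("Half volume.")
#   engine.runAndWait()
```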

Voice Gender

This feature lets you change the computer-generated voice on the basis of gender. Pyttsx3 exposes the installed voices as a list; on a typical Windows setup, index 0 is a male voice and index 1 is a female voice, but this depends on the voices installed on your system. To change the voice gender, we use the setProperty() method, passing the property name voice and the id of a voice picked from the list by its index number.

https://python.plainenglish.io/media/986903a3d278f7098b0856f40c684987
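Picking a voice by its index could be sketched like this. The helper name use_voice is hypothetical; the voices and voice properties are the ones pyttsx3 provides, and which index maps to which gender is an assumption that you should verify on your own machine.

```python
def use_voice(engine, index):
    """Switch the engine to the installed voice at position `index`.

    `use_voice` is a hypothetical helper; `engine` is expected to be
    the object returned by pyttsx3.init().
    """
    voices = engine.getProperty("voices")  # list of installed voice objects
    if not 0 <= index < len(voices):
        raise IndexError(f"only {len(voices)} voice(s) installed")
    engine.setProperty("voice", voices[index].id)  # select the voice by its id
    return voices[index]


# Usage (needs pyttsx3 and an audio device), shown as comments:
#   import pyttsx3
#   engine = pyttsx3.init()
#   chosen = use_voice(engine, 1)
#   print(chosen.name)          # check which voice you actually got
#   engine.say("Hello from another voice.")
#   engine.runAndWait()
```

Printing the name of the chosen voice is a quick way to confirm what a given index means on your system.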

Changing the Voice Engine

We can also change the voice in pyttsx3 by choosing a different speech engine.

The library supports the following engines:

  1. sapi5 — SAPI5 on Windows
  2. nsss — NSSpeechSynthesizer on Mac OS X
  3. espeak — eSpeak on every other platform

If espeak does not sound very natural, you can try sapi5 if you are on Windows or nsss if you are on Mac OS X. The code example below shows how to change the voice.

https://python.plainenglish.io/media/91e9f57ee146711d92c1011406976968
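To choose an engine explicitly, pyttsx3.init() accepts a driverName argument. The sketch below maps the current platform to one of the three engines listed above; pick_driver, make_engine and DRIVER_BY_PLATFORM are hypothetical names of mine, and the mapping simply mirrors that list.

```python
import sys

# Platform identifiers as reported by sys.platform, mapped to driver names.
DRIVER_BY_PLATFORM = {
    "win32": "sapi5",   # Windows
    "darwin": "nsss",   # Mac OS X
}


def pick_driver(platform=None):
    """Return the driver name for `platform` (defaults to this machine)."""
    platform = sys.platform if platform is None else platform
    return DRIVER_BY_PLATFORM.get(platform, "espeak")  # espeak everywhere else


def make_engine(driver=None):
    """Create an engine with an explicit driver (hypothetical helper)."""
    import pyttsx3  # third-party: pip install pyttsx3

    return pyttsx3.init(driverName=driver or pick_driver())
```

Calling make_engine() should give the same engine that pyttsx3.init() picks by default, while make_engine("espeak") asks for eSpeak specifically, which only works if that engine is installed.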

Saving the Audio File

Usually, the pyttsx3 module plays the voice right after conversion, but it does not leave an audio file behind. Pyttsx3 has a method that saves the audio to a file after the conversion.

https://python.plainenglish.io/media/8a2ab8e13d94140d92d1b7692623aec2

As you can see in the code above, instead of the say() method we use the save_to_file() method, passing it the text and the filename.
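Saving instead of speaking might be sketched as follows. The helper name save_speech is hypothetical; save_to_file() and runAndWait() are the pyttsx3 engine methods, and the filename in the usage comment is only an example.

```python
def save_speech(engine, text, filename):
    """Write the spoken form of `text` to `filename` instead of playing it.

    `save_speech` is a hypothetical helper; `engine` is expected to be
    the object returned by pyttsx3.init().
    """
    engine.save_to_file(text, filename)  # queue the text for writing to a file
    engine.runAndWait()                  # the file is written while the queue runs
    return filename


# Usage (needs pyttsx3), shown as comments:
#   import pyttsx3
#   engine = pyttsx3.init()
#   save_speech(engine, "This sentence goes to a file.", "speech.mp3")
```

Note that runAndWait() is still needed: without it the request stays in the queue and no file is produced.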

gTTS Library

gTTS is a Python library that wraps the Google text-to-speech API. It has only a female voice, but it sounds much more natural than many other text-to-speech libraries. To install this module, run the following command in your command prompt or terminal.

pip install gTTS

https://python.plainenglish.io/media/504923351f7249c8a00733800d08a9a2

Let's break down the code. First, we import the required module; next, we create a gTTS object and pass a text string into it. Finally, we save the converted speech to a file.
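Those three steps can be sketched as below. The function name text_to_mp3 and its tts_class parameter are hypothetical helpers of mine; gTTS(text=..., lang=...) and its save() method are the library calls. Keep in mind that gTTS talks to an online service, so it needs an internet connection.

```python
def text_to_mp3(text, filename="speech.mp3", lang="en", tts_class=None):
    """Convert `text` to speech with gTTS and save it as an MP3 file.

    `text_to_mp3` and `tts_class` are hypothetical names; by default the
    real gTTS class is used.
    """
    if tts_class is None:
        # Imported lazily so the function can be defined without the module.
        from gtts import gTTS  # third-party: pip install gTTS

        tts_class = gTTS

    tts = tts_class(text=text, lang=lang)  # build the speech request
    tts.save(filename)                     # fetch the audio and write the file
    return filename
```

Calling text_to_mp3("Hello from gTTS") should leave a speech.mp3 file in the current directory, which you can open in any audio player.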

But we do not always want the audio file just to be stored; often we need it to play right after the conversion. For that we use a supporting module named playsound: we pass it the audio file and the module plays it. Let's modify our code so that the voice plays after the conversion. Install the module using the following command.

pip install playsound

https://python.plainenglish.io/media/03b558978182fa78031c31eddc37a826
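Converting and then playing the result could look like the sketch below. The function name save_and_play and the tts_class and player parameters are hypothetical helpers of mine; by default the real gTTS class and the playsound() function are used, and the filename is only an example.

```python
def save_and_play(text, filename="speech.mp3", lang="en",
                  tts_class=None, player=None):
    """Convert `text` to an MP3 with gTTS, then play it with playsound.

    `save_and_play`, `tts_class` and `player` are hypothetical names.
    """
    if tts_class is None:
        from gtts import gTTS  # third-party: pip install gTTS

        tts_class = gTTS
    if player is None:
        from playsound import playsound  # third-party: pip install playsound

        player = playsound

    tts_class(text=text, lang=lang).save(filename)  # write the audio file
    player(filename)                                # play the saved file
    return filename
```

Calling save_and_play("Hello from gTTS") should save speech.mp3 and then play it, provided you are online and have a working audio device.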

Final Thoughts

So far you have learned two ways to convert text to speech, but there are other modules as well that you can explore. Both approaches are quite popular among Python programmers. I hope this article will help you in the future, and feel free to share your feedback. Happy coding!
