
How to Create a PDF Audio Reader Using Python

 You can turn any PDF file into an audiobook using Python.

PDF (Portable Document Format) files are great for distribution because they can be read on almost any device. Still, many readers prefer listening to their content over reading it. Fortunately, you can use Python to turn PDF documents into audio files and play them on several popular devices.

PDF-to-audio conversion is helpful because it lets people who are blind or have low vision access the document. It also helps people who are in a hurry and need to get through a PDF file quickly.

I’ve seen people who frequently get migraines have difficulty reading PDF files because of the strain on their eyes. Audio conversion alleviates this strain, allowing them to access the information with less effort. It also helps in situations where looking at a screen isn’t practical, such as on an airplane or while waiting in line.

The first step in creating a PDF reader with Python is downloading and installing the correct libraries. In this tutorial, we’ll be using PyPDF2 and pyttsx3. PyPDF2 is a pure-Python library for reading and editing PDF files, while pyttsx3 converts the extracted text into speech.


These libraries are not installed by default, so we must install them using pip. We can do this by opening a command prompt or terminal and entering the following command:

pip install PyPDF2 pyttsx3
Bash
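
Before moving on, it can help to confirm that both packages are importable from the interpreter you plan to use. The snippet below is a small sketch that relies only on the standard library; the helper name missing_packages is made up for this illustration.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Import names of the two libraries used in this tutorial.
missing = missing_packages(["PyPDF2", "pyttsx3"])
if missing:
    print("Please install:", ", ".join(missing))
else:
    print("All set.")
```

If a package shows up as missing even after installing it, pip may have installed it into a different Python environment than the one running the script.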

1. Read PDF files using Python

Reading files in Python is straightforward: we usually just open the file with the open() function.

To read the contents of a PDF file, however, we need the PyPDF2 library. We create a PdfReader object (called PdfFileReader in older releases, a name PyPDF2 3.0 removed) and pass it the PDF file path. This reader object is the entry point for all our remaining work.

PyPDF2 is often used to read and merge entire files in a directory. Besides extracting text from PDF files, it also pairs well with other popular Python libraries, such as TextBlob and NLTK, for analyzing text data.

Several other Python wrappers are built on top of PyPDF. A good example is PDF tools, which offers other PDF-related features such as copying and inserting images. Check it out if you need more than extracting text or adding custom data to a PDF file.

And here’s how simple it is to read a PDF file into a reader object in Python.

from PyPDF2 import PdfReader

# create a PdfReader object (named PdfFileReader before PyPDF2 3.0)
reader = PdfReader("/path/to/file.pdf")
Python
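
A wrong path is a common way for this step to fail. As a sketch, a small wrapper can check the path before handing it to PyPDF2. The function name open_pdf is hypothetical, and the code assumes a PyPDF2 release that provides PdfReader.

```python
from pathlib import Path


def open_pdf(path):
    """Return a PyPDF2 reader for `path` after basic sanity checks."""
    pdf_path = Path(path)
    if not pdf_path.is_file():
        raise FileNotFoundError(f"No such PDF file: {pdf_path}")
    if pdf_path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {pdf_path.name}")

    # Imported here so the path checks work even without PyPDF2 installed.
    from PyPDF2 import PdfReader

    return PdfReader(str(pdf_path))
```

Calling open_pdf("/path/to/file.pdf") then gives you the same kind of reader object as above, with a clearer error message when the file is missing.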

2. Get the page to read

A PDF file may contain multiple PDF pages. So we need to specify which page we want to read.

PyPDF2 exposes the pages through the reader’s pages property, which returns a list-like sequence of all the pages in the PDF. Its length, len(reader.pages), is the number of pages (older releases offered a numPages property for this, which PyPDF2 3.0 removed). You can access specific pages with their list index.

number_of_pages = len(reader.pages)
page = reader.pages[0]
Python

You can loop through the pages when converting a large PDF into an audiobook.

for page in reader.pages:
    ...  # do the rest

# or, when you also need the page index
for i in range(len(reader.pages)):
    page = reader.pages[i]
    ...  # do the rest
Python
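
To make the loop concrete, here is a minimal sketch that collects the text of a range of pages into a list. The helper name extract_pages is an assumption for this example. It only relies on the reader having a pages sequence and each page having an extract_text() method, as current PyPDF2 releases do.

```python
def extract_pages(reader, start=0, stop=None):
    """Return the text of pages start..stop-1 as a list of strings.

    `reader` only needs a `pages` sequence whose items provide
    `extract_text()`, which is what PyPDF2's PdfReader offers.
    """
    pages = reader.pages
    stop = len(pages) if stop is None else min(stop, len(pages))
    texts = []
    for index in range(start, stop):
        # Pages without a text layer (for example, scans) may yield nothing.
        text = pages[index].extract_text() or ""
        texts.append(text)
    return texts
```

For instance, extract_pages(reader, 0, 10) would return the text of the first ten pages, or fewer if the document is shorter.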

3. Convert PDF to text using Python

Once we decide which page to read, we need to extract the text content from that page. In PyPDF2, we call the page’s extract_text() method (named extractText() before PyPDF2 3.0).

text = page.extract_text()
Python
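
Extracted text often carries the PDF’s line breaks and end-of-line hyphenation, which can make the speech sound choppy. The following is one possible cleanup step, not something PyPDF2 provides; the function name clean_text and the two rules it applies are assumptions chosen for illustration.

```python
import re


def clean_text(raw):
    """Tidy extracted PDF text so it reads more naturally aloud."""
    # Rejoin words split by a hyphen at the end of a line: "conver-\nsion".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Collapse runs of whitespace (including newlines) to single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

Note that the first rule also joins genuinely hyphenated words that happen to break at a line end, so treat it as a rough heuristic.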

4. Configure the pyttsx3 engine

Now that we have the text content of our PDF file, we need to convert it into audio. For this, we’ll use the pyttsx3 library.

Pyttsx3 is a text-to-speech conversion library in Python. It works offline and is compatible with multiple platforms, including Windows, Linux, and macOS.

The first step is to create an instance of pyttsx3’s Engine class by calling pyttsx3.init() and storing the result, for example in a variable named engine.

Once the engine is ready, we can set the voice (male, female), volume, and speaking rate. But these are all optional.

Set the voice—male/female

To set the voice, we first need the list of voices available in the engine.

voices = engine.getProperty('voices')
Python

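Now, we can set the voice property on the engine. On a typical Windows setup, index 0 is the male voice and index 1 is the female voice. (I know, it should be the other way around.) The list depends on the voices installed on your system, so check each voice’s name attribute if you’re unsure.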

#Male
engine.setProperty('voice', voices[0].id)  

#Female
engine.setProperty('voice', voices[1].id)
Python
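
Because the order of the voice list varies from system to system, selecting a voice by name can be more robust than using a fixed index. Here is a minimal sketch; pick_voice is a hypothetical helper, and it assumes each voice object has the id and name attributes that pyttsx3 voices provide.

```python
def pick_voice(voices, keyword, default_index=0):
    """Return the id of the first voice whose name contains `keyword`.

    Falls back to voices[default_index] when nothing matches.
    """
    needle = keyword.lower()
    for voice in voices:
        if needle in (voice.name or "").lower():
            return voice.id
    return voices[default_index].id


# Usage (the keyword "zira" is just an example of a voice name):
#   voices = engine.getProperty('voices')
#   engine.setProperty('voice', pick_voice(voices, "zira"))
```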

Set reading speed

You can change the reading speed by setting the rate property to your desired words per minute.

engine.setProperty('rate', 150)
Python

Instead of setting a specific rate, if you only want to speed (or slow) the reading, you can refer to the current speed and adjust it.

# refer to the current value
rate = engine.getProperty('rate') 

engine.setProperty('rate', rate+50)
Python
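
Since the rate is expressed in words per minute, you can roughly estimate how long a text will take to listen to. This is only a back-of-the-envelope sketch (the function name estimate_minutes is made up), and the real duration will vary with the voice and the speech driver.

```python
def estimate_minutes(text, rate=150):
    """Estimate how long `text` takes to read at `rate` words per minute."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    word_count = len(text.split())
    return word_count / rate
```

For example, a 300-word page at a rate of 150 comes out at about two minutes.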

Change the volume

In pyttsx3 you can set the volume to any value between 0 and 1, where 0 mutes the speech and 1 is the maximum volume.

engine.setProperty('volume', 0.75)
Python

5. Read or Save the audio

We’ve done all the background work. Now let’s dive into action. In pyttsx3, the say method only queues the text to be spoken; you then have to call the runAndWait method to do the actual speech.

engine.say(text)
engine.runAndWait()
Python
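
Because say only queues text, you can queue several pieces before starting the engine. The sketch below splits the text into sentences and queues them one by one. The helper names split_sentences and speak are assumptions, and the sentence splitting is a simple regular-expression heuristic.

```python
import re


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def speak(engine, text):
    """Queue each sentence, then let the engine speak the whole queue."""
    for sentence in split_sentences(text):
        engine.say(sentence)
    engine.runAndWait()
```

With a real engine you would call speak(engine, text) in place of the two lines above.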

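Finally, we can save the audio to a file using the save_to_file() method of pyttsx3’s Engine class. Like say, it only queues the work, so call runAndWait afterward to actually write the file. Note that the audio encoding is decided by your platform’s speech driver, so the result may not be a true MP3 even with an .mp3 extension.

engine.save_to_file(text, 'test.mp3')
engine.runAndWait()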
Python

Putting it all together

We’ve walked through the code line by line to understand it better. Now, the fuller version to convert PDFs and save them to a file would look like this.

import pyttsx3
from PyPDF2 import PdfReader


# create a PdfReader object (named PdfFileReader before PyPDF2 3.0)
reader = PdfReader("/path/to/file.pdf")


# extract text from page 1 (index 0)
page = reader.pages[0]
text = page.extract_text()


# create a pyttsx3 engine
engine = pyttsx3.init()


# set the voice (index 1 is often a female voice, depending on the system)
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)


# set the speed/rate of speech in words per minute
engine.setProperty('rate', 150)


# set the volume (0.0 to 1.0)
engine.setProperty('volume', 0.75)


# queue the save command, then run the engine to write the file
engine.save_to_file(text, 'test.mp3')
engine.runAndWait()
Python
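
To convert a whole document instead of a single page, the same steps can be wrapped in a function. This is a sketch under a few assumptions: the name pdf_to_audio is made up, the reader is expected to behave like PyPDF2’s PdfReader, and the engine like the object returned by pyttsx3.init(). Passing both in as arguments keeps the function easy to try out with stand-in objects.

```python
def pdf_to_audio(reader, engine, out_path, rate=150, volume=0.75):
    """Convert every page of `reader` into one audio file at `out_path`.

    `reader` is expected to behave like PyPDF2's PdfReader and `engine`
    like the object returned by pyttsx3.init().
    """
    # Join the text of all pages, skipping pages with no extractable text.
    page_texts = [page.extract_text() or "" for page in reader.pages]
    text = "\n".join(t for t in page_texts if t.strip())
    if not text:
        raise ValueError("No extractable text found in the PDF")

    engine.setProperty('rate', rate)
    engine.setProperty('volume', volume)

    # save_to_file only queues the job; runAndWait performs it.
    engine.save_to_file(text, out_path)
    engine.runAndWait()
    return text


# Usage, assuming both libraries are installed:
#   import pyttsx3
#   from PyPDF2 import PdfReader
#   pdf_to_audio(PdfReader("/path/to/file.pdf"), pyttsx3.init(), "book.mp3")
```

Scanned PDFs without a text layer will trigger the ValueError, since there is nothing for the engine to read.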

The above code is a simple example to inspire you. To recap, we

  • create a PDF reader object;
  • extract text from the reader object by picking the exact PDF page;
  • create a pyttsx3 engine;
  • set the voice, volume, and speed settings; and
  • save the speech to a file.

You can do more than PDF file processing with these Python packages and other tools.

Final thoughts

PDF files are everywhere, but readers’ preferences keep changing.

Many people would rather listen to a PDF than read it, because listening leaves them free to do something else at the same time. Reading doesn’t offer that flexibility. Listening is also more convenient for people with vision impairment.

Yet audio versions don’t exist for every PDF document online. Python programmers can easily convert them into audio files with a few lines of code. We have excellent PDF processing utilities that extract text with relatively simple syntax, so even people with very little programming experience can work with unstructured data in PDF files.

There are online PDF readers, of course, but we may still need programmatic text-to-speech in a variety of situations.

In this article, we’ve discussed how to convert PDF files into audio files. We’ve also looked at ways to modify the speech with different volumes, voices, and speeds.
