
Creating Wordclouds in Python: The Quick and Easy Guide

Word clouds (tag clouds) are a great way to visualize text data, and Python makes it easy to create one. This post walks through an example of using the WordCloud library to generate word clouds.

If you don’t know what it is, a word cloud visualizes how often words appear in a given piece of text. The more often a word appears, the larger it will be in the word cloud.
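To make the idea concrete, here is a minimal sketch of the counting step using only the standard library. The `word_counts` helper and the sample sentence are illustrative assumptions; the WordCloud library does its own tokenizing and font scaling internally.

```python
import re
from collections import Counter


def word_counts(text):
    """Count how often each word appears, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "Python is easy. Python is readable. Python is fun."
counts = word_counts(sample)

# Words with higher counts would be drawn larger in a word cloud.
for word, count in counts.most_common(3):
    print(word, count)
```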

There are several free word cloud generator tools for general use. They can be handy if you’re preparing a word cloud for a presentation or a document, because you don’t have to do any programming. But you’ll often need to generate word clouds dynamically or in large batches. This is where creating word clouds becomes tricky.

So, how do you make a word cloud programmatically? That’s the focus of this post. We will start with a basic word cloud and move on to more advanced word cloud art.

Let’s build our word cloud generator (people sometimes call it a phrase cloud generator).

1. Install the WordCloud library

WordCloud is a free and open-source Python library. At the time of writing, its GitHub repo has 8.7k stars, 2.2k forks, and 62 contributors.

You can install the WordCloud library from PyPI.

pip install wordcloud

If you’re using Poetry to manage Python packages, you can use the following command.

poetry add wordcloud


2. Load Data

For this tutorial, we’ll be using the following text.

“Python is a widely used high-level interpreted language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.”

You can set this as a variable in Python, as shown below.

text = """
Python is a widely used high-level interpreted language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.
"""

However, you might have to load a lot more data than this. For instance, if you have a dataset of reviews, you have to load them in a different way.

You can create word clouds from a pandas DataFrame. The following code prepares a combined review text in three steps:

  • read a pandas DataFrame from a CSV file
  • join all the user reviews
  • assign the result to a new variable, ‘text’

import pandas as pd

text = pd.read_csv("data.csv").Reviews.str.cat(sep=" ")

If you don’t have a pandas dataframe, you can also read the text from a file.

with open("data.txt", "r") as file:
    text = file.read()

Either way, you should end up with a string of text.
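If pandas is not available, the same combined text can be built with the standard library. The sketch below is one possible approach, not the only one. The `join_reviews` helper is a hypothetical name, and the `Reviews` column is assumed to match the CSV used above.

```python
import csv


def join_reviews(lines, column="Reviews"):
    """Join every non-empty value of one CSV column into a single string."""
    rows = csv.DictReader(lines)
    reviews = [row[column] for row in rows if row.get(column)]
    # Join with a space so the last word of one review does not
    # run into the first word of the next.
    return " ".join(reviews)


# Usage with a real file (assumes data.csv has a "Reviews" column):
# with open("data.csv", newline="", encoding="utf-8") as file:
#     text = join_reviews(file)
```

Rows with an empty review are skipped, so they add nothing to the word counts.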

3. Generate the word cloud

Now that we have the text data loaded, let’s build a word cloud.

Creating a word cloud is easy with the WordCloud library. The following code will create a word cloud from the text we loaded earlier.

from wordcloud import WordCloud

import matplotlib.pyplot as plt
%matplotlib inline

wordcloud = WordCloud().generate(text)

plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")

First, we import the WordCloud class from the wordcloud library. Then, we import matplotlib. We use the %matplotlib inline magic command so that the word cloud appears inline in the notebook.

Then, we create a WordCloud instance and generate the word cloud using the text variable.

Finally, we use the plt.imshow() function to display the word cloud. The word cloud is displayed using the default settings.

 

WordCloud created in Python

If you want to change the appearance of the word cloud, you can use different settings. For example, you can change the background color (background_color), the maximum number of words (max_words), the maximum font size (max_font_size), and so on.

The following code shows how to change the background color to #e2e1eb and the max_words to 10.

wordcloud = WordCloud(background_color="#e2e1eb", max_words=10).generate(text)

plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")

As you can see, the word cloud looks different with these settings. Play around with the settings to get the word cloud that you want.

Create word clouds in custom shapes

You can get creative with word clouds. Set the mask option to an image array to draw the word cloud inside a shape. The masking image should have a black object on a white background. Here’s how we made our word cloud into a heart shape.

from PIL import Image
import numpy as np

mask_img = np.array(Image.open("./Untitled design.png"))

wordcloud = WordCloud(background_color="#e2e1eb", max_words=100, mask=mask_img).generate(text)

The above code will result in the following image.

 

WordCloud created in Python in a heart shape
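A mask does not have to come from an image file. As a rough sketch, assuming NumPy is installed, the array can also be built in code. The `circle_mask` helper below is a hypothetical name. It follows the same convention as the heart image: black (0) where words may be drawn, white (255) everywhere else.

```python
import numpy as np


def circle_mask(size=400):
    """Return a square uint8 mask: 0 (black) inside a circle, 255 (white) outside."""
    y, x = np.ogrid[:size, :size]
    center = (size - 1) / 2
    radius = size * 0.45
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return np.where(outside, 255, 0).astype(np.uint8)


mask_img = circle_mask()
# mask_img can now be passed as WordCloud(mask=mask_img),
# in place of the array loaded from an image file.
```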

Conclusion

Word clouds are an excellent way to communicate which topics are hot.

We’ve briefly discussed how to create a word cloud and export it to a PNG file using Python. We’ve also created a word cloud in a custom shape.

Here’s what the complete code will look like.

import numpy as np
from PIL import Image
from wordcloud import WordCloud

import matplotlib.pyplot as plt
%matplotlib inline

text = """
Python is a widely used high-level interpreted language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. It provides constructs that enable clear programming on both small and large scales.
"""

# Read the mask image and convert it to an array. It should have a white (not transparent) background with a black object.
mask_img = np.array(Image.open("./heart.png"))

# Generate the word cloud inside the mask shape
wordcloud = WordCloud(background_color="#e2e1eb", max_words=100, mask=mask_img).generate(text)

# store to file
wordcloud.to_file("wc.png")

# Show the image
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")

Generating word clouds in Python is easy. But there are free word cloud generators on the internet too. These can help if you need a tag cloud for a single use, such as a presentation. For instance, Word Art is one of the more creative word cloud generators.

Some of these tools can even generate clickable word clouds. But these free word cloud generators have serious limitations regarding scalability and programmability.

When you need to create word clouds in bulk or regenerate them periodically, you’ll need programmatic word cloud creation. This is where our Python word cloud generator comes in handy. You, too, may need your own word cloud generator.
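As a sketch of what bulk creation could look like, the function below renders one PNG per entry in a dictionary of texts. The names `safe_filename` and `save_word_clouds`, the `clouds` output folder, and the label-to-text dictionary are all assumptions made for illustration. The WordCloud settings are the ones used earlier in this post.

```python
import re
from pathlib import Path


def safe_filename(name):
    """Turn a label such as 'Product A: reviews' into 'product-a-reviews'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def save_word_clouds(texts, out_dir="clouds"):
    """Render one PNG per entry in a {label: text} mapping and return the paths."""
    # Imported here because only the rendering step needs the wordcloud package.
    from wordcloud import WordCloud

    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    saved = []
    for label, text in texts.items():
        target = out_path / f"{safe_filename(label)}.png"
        cloud = WordCloud(background_color="#e2e1eb", max_words=100).generate(text)
        cloud.to_file(str(target))
        saved.append(target)
    return saved
```

Calling save_word_clouds({"Product A: reviews": text}) would write clouds/product-a-reviews.png. A scheduler or a loop over many datasets can reuse the same function.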
