Create Stunning Visualizations With Pandas Dataframes in One Line of Code

A simple trick to make your charts interactive and visually appealing

Great visualization leads to excellent insights.

Almost every data scientist who uses Python also uses Pandas. It’s the de facto Python library for data wrangling. Out of the box, Pandas offers some great visualizations for common chart types.

But the defaults aren’t the best.

We could make it even better with a companion framework such as Plotly. We can set the plotting backend to Plotly and use its stylish charts in our projects.

But setting the backend alone doesn’t give the full benefit of Plotly for our dataframes. For example, Pandas doesn’t have a surface plot option. Also, Plotly has a slightly different way of creating charts than Pandas.

Cufflinks is another library that bridges this gap. With it, we can use the same Pandas-like calls to create more stunning charts, and we can also build advanced charts such as surface plots.

How to create plots from dataframes: the pure Pandas way

In Pandas, if you want to create charts such as bar charts and box plots, all you have to do is call the plot method. We can specify the type of chart we need and several other configurations.

In the following example, we create a bar chart using Pandas’ built-in plot method.

df.plot(kind='bar', x="month", y="passengers")

 

Bar chart created by Pandas using the Matplotlib backend. Screenshot by author.

The call above is very straightforward, yet the resulting chart isn’t very presentable.
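The bar chart call assumes a dataframe named df with month and passengers columns, which the post does not show being built. Here is a minimal sketch with made-up numbers; the values and the helper name plot_passengers are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical monthly passenger counts; the numbers are invented for illustration.
df = pd.DataFrame(
    {
        "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
        "passengers": [100, 120, 90, 150, 170, 160],
    }
)


def plot_passengers(frame):
    """Draw the bar chart with whichever plotting backend is active.

    With the default backend this needs matplotlib to be installed.
    """
    return frame.plot(kind="bar", x="month", y="passengers")


# In a notebook, calling plot_passengers(df) renders the chart.
```

Keeping the plot call in a small function makes it easy to rerun the same chart after switching backends.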

Pandas’ default plotting backend is Matplotlib. It works fine in many instances. But charts can be better with a different backend.

We can quickly turn this ordinary chart into a beautiful one by changing the plotting backend to Plotly.

pip install plotly==5.5
# conda install -c plotly plotly==5.5.0

If Plotly isn’t installed on your computer yet, use one of the commands above.

We can set the backend to Plotly with the following line. I recommend adding it right after you import pandas into your notebook (or project).

import pandas as pd
pd.options.plotting.backend = "plotly"

The resulting chart is more aesthetically appealing, and it has been summarized well.

Bar chart created by the Plotly backend of Pandas. Screenshot by author.
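If you would rather not change the backend for the whole session, the plot method also accepts a backend argument for a single call. The sketch below assumes Plotly is installed and reuses a made-up month and passengers table; the helper name plot_with_plotly is hypothetical.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame(
    {
        "month": ["Jan", "Feb", "Mar", "Apr"],
        "passengers": [100, 120, 90, 150],
    }
)


def plot_with_plotly(frame):
    """Return a Plotly figure for this one call.

    The backend keyword overrides pd.options.plotting.backend for this call only,
    so the global setting stays untouched. Requires plotly to be installed.
    """
    return frame.plot(backend="plotly", kind="bar", x="month", y="passengers")


# In a notebook: plot_with_plotly(df).show()
```

This is handy when most charts in a notebook should stay on Matplotlib and only a few need to be interactive.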

However, as I mentioned earlier, the Plotly backend still lacks some key chart types. Let’s take a different approach to tap Plotly’s full potential.

How to create plots from a dataframe using Cufflinks

Cufflinks is a Python library that helps us use Plotly with Pandas in a native Pandas-like syntax. It also adds more impressive chart types than we normally see in Pandas dataframes.

We import it once and configure the global theme and other options. After that, every dataframe gets an iplot method that we can use instead of the default plot method.

Let’s install it from PyPI with the following command.

pip install cufflinks

Once installed, we can import and configure it in our Notebook.

import pandas as pd
import cufflinks as cf
import numpy as np
cf.set_config_file(theme='pearl')

We can now create many different charts using the iplot method. Here’s an example.

cf.datagen.lines(4,1000).iplot()

We used the datagen module of the Cufflinks package. It allows us to generate random data for various situations. We’ll use it to create data for the rest of this post.

Line chart generated using Cufflinks, Pandas, and Plotly. Screenshot by author.

This minor tweak dramatically improves the presentability of our charts without significantly changing our code.

Types of Plotly data visualizations we can create on a dataframe.

Plotly has many different chart types. Cufflinks exposes several of them, and we can call those directly from a dataframe.

Here are some charts that aren’t available in Pandas but are made possible through cufflinks.

3D surface plots.

Surface plots are a visual representation of 3-dimensional data. They’re helpful in many applications. For instance, we use surface plots in machine learning to study cost functions and gradient descent optimization.

The following code creates a surface plot from a dataframe. Cufflinks picks the column labels and the row index as the x and y axes. The values of the dataframe go on the z-axis.

Dataset for the surface plot.

cf.datagen.sinwave(10,.25).iplot(kind='surface')

Surface plot created with Cufflinks, Plotly, and Pandas. 
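To make the grid layout concrete, here is a minimal sketch that builds such a dataframe by hand with NumPy instead of datagen. The function being plotted and the grid size are arbitrary choices for illustration, not something the post prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical grid: 25 evenly spaced points along each horizontal axis.
x = np.linspace(-3, 3, 25)
y = np.linspace(-3, 3, 25)

# meshgrid returns 2D arrays in which rows vary with y and columns vary with x.
xx, yy = np.meshgrid(x, y)
zz = np.sin(np.sqrt(xx**2 + yy**2))

# Row index and column labels carry the two horizontal axes; the cells hold the heights.
surface_df = pd.DataFrame(zz, index=y, columns=x)

# With Cufflinks imported and configured, surface_df.iplot(kind='surface') would draw it.
```

Any dataframe shaped like this, with one axis in the index, the other in the columns, and heights in the cells, is in the format the surface plot expects.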

If your dataset stores the x, y, and z values in separate columns, use the pivot function to convert it to this grid format before plotting.

 

Pivoting table to use in a surface plot. 
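The pivot step can be sketched as follows. The column names x, y, and z and the values are hypothetical; substitute the names used in your own dataset.

```python
import pandas as pd

# Hypothetical long-format data: one row per (x, y) observation.
long_df = pd.DataFrame(
    {
        "x": [0, 0, 0, 1, 1, 1],
        "y": [10, 20, 30, 10, 20, 30],
        "z": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)

# Rows come from "y", columns from "x", and the cell values from "z".
wide_df = long_df.pivot(index="y", columns="x", values="z")

# With Cufflinks imported and configured, wide_df.iplot(kind='surface') would draw it.
```

Note that pivot raises an error when the same (y, x) pair appears more than once. In that case pivot_table with an aggregation function such as mean is the usual alternative.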

Bubble charts

Bubble charts are another great way to visualize multiple dimensions in a meaningful way. We can picture four features in a single chart, including one categorical variable.

The following chart shows how planet size changes with its distance from the sun and its mass for every planet (Fake data, of course.)

cf.datagen.bubble(prefix="planet").iplot(
    kind='bubble', x='x', y='y', size='size', categories='categories', text='text',
    xTitle='Relative Mass', yTitle='Distance from the Sun',
    title='Planet Size, Mass and Distance',
)

Bubble chart created with Cufflinks, Pandas, and Plotly
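For your own data, the bubble call needs one row per bubble and one column for each argument it names. Here is a minimal sketch of such a table, assuming the same column names that the call above passes (x, y, size, categories, text); the values and category labels are randomly generated placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n = 12  # number of bubbles

# Hypothetical data: every value here is a random placeholder.
bubble_df = pd.DataFrame(
    {
        "x": rng.uniform(0.1, 10, n),        # horizontal position
        "y": rng.uniform(50, 500, n),        # vertical position
        "size": rng.integers(10, 100, n),    # bubble size
        "categories": rng.choice(["rocky", "gas giant", "ice giant"], n),  # color group
        "text": [f"planet {i}" for i in range(n)],  # hover label
    }
)

# With Cufflinks configured, the same iplot(kind='bubble', ...) call would work on bubble_df.
```

The three numeric columns plus the categorical one are the four features the chart encodes.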

 

Also, note that the charts you create with the Cufflinks extension are interactive. Hover over any bubble to see its details. You can click on any category to turn it on or off.

Heatmap charts

Heatmaps are often a much easier way to find hot spots in our dataset. Like surface plots, they let us visualize three data dimensions simultaneously, but a color spectrum takes the place of the z-axis.

Like other chart types, it’s easy to create a heatmap too.

cf.datagen.heatmap(20,20).iplot(kind='heatmap',colorscale='spectral')

 

Heatmap created with Cufflinks, Plotly on a Pandas dataframe

Spread charts

Suppose you’re tracking a variable for two categories over time. You might also want to see how the difference between them changes: whether the gap is narrowing, whether it has flipped, or how the disparity itself is trending.

Spread charts are an excellent way to visualize the spread between two variables over time.

cf.datagen.lines(2).iplot(kind='spread',xTitle='Dates',yTitle='Return')

The spread chart works just like a line plot. But in addition to plotting the individual lines, it also draws an area chart of the spread beneath them. Both share the same time axis, which makes the chart easy to read.

 

Spread chart generated with Cufflinks and Plotly on a Pandas dataframe
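If you need the spread as numbers rather than as a picture, it is easy to compute in Pandas. The sketch below assumes the spread is simply the first series minus the second; the post does not spell out how Cufflinks computes it, and the series names and values are made up.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
dates = pd.date_range("2022-01-01", periods=100, freq="D")

# Hypothetical cumulative returns for two funds (random walks).
returns = pd.DataFrame(
    {
        "fund_a": rng.normal(0, 1, 100).cumsum(),
        "fund_b": rng.normal(0, 1, 100).cumsum(),
    },
    index=dates,
)

# Assumption: the spread is the difference between the two series at each date.
spread = (returns["fund_a"] - returns["fund_b"]).rename("spread")

# Boolean series marking the dates on which fund_a is ahead of fund_b.
a_leads = spread > 0
```

A change in a_leads from one date to the next marks a point where the gap flipped.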

These are only some of the many charts you can create with the Cufflinks extension. Of course, you can also make the more common charts available through Pandas’ default API.

Changing the theme of charts.

Through the Cufflinks configuration, you can easily switch between several color themes. The following themes are available.

ggplot, pearl, solar, space, white, polar, henanigans

Earlier in this article, we used the pearl theme when we configured Cufflinks for the first time. Here’s how to change it to a different theme.

cf.set_config_file(theme='henanigans')

Here’s how the last example we used appears in other color themes.

Final thoughts

Visualization makes all the difference in what we can do with the data at hand.

When working with Pandas dataframes, we primarily use the default plot method to create graphics. But these charts aren’t polished enough to present as they are. A quick trick is to change the plotting backend to Plotly and get better-looking charts.

However, another Plotly binding for Pandas dataframes, known as Cufflinks, adds extra possibilities to the default Pandas plotting option. With it, we can quickly switch between several preconfigured themes and unlock charts that aren’t available in Pandas.

This post discussed how to get started with Pandas, Plotly, and Cufflinks. We’ve also created some fantastic visuals of our dataframe all in one line of code.
