Data Visualization is an important discipline for interactive presentation and plotting of data and information. With the increase in the amount of data, more companies have started adopting data analysis techniques to find patterns and trends in data. Such analysis requires data visualization as a mechanism to communicate the results effectively. Data Visualization in Python is done with the help of various elements such as graphs, charts, and maps.

Data Visualization in Python (With Tutorials)

Tools for Data Visualization in Python

Python has a flexible range of tools and techniques for visualizing various types of data. Some of the popular ones are as follows:

1. Matplotlib – Matplotlib is a plotting library for Python. It provides tools to generate many kinds of plots, such as line plots, histograms, bar charts, and scatter plots, in a few lines of code.

2. Seaborn – Seaborn is also a visualization library for Python and is built on top of Matplotlib. It provides a high-level interface for drawing statistical graphics.

3. Pandas – Apart from data manipulation tools, the Pandas library offers built-in plotting methods that use Matplotlib by default. They can be used to generate basic plots such as bar charts, scatter plots, histograms, and box plots directly from a DataFrame or Series.

4. ggplot – ggplot is a system for creating graphics that implements the Grammar of Graphics. It originated in the R community as ggplot2, and Python ports such as plotnine follow the same approach.

5. Plotly – Plotly is an interactive data visualization library for Python. It can be used to build a wide range of interactive charts.

6. Bokeh – Bokeh is based on the concept of graphs that are built up one layer at a time. It is used to build interactive graphical plots in Python.
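
As a rough sketch of the built-in Pandas plotting mentioned in the list above, the snippet below calls DataFrame.plot() on a small table. The column names and values are made up for illustration, and the example assumes that both pandas and matplotlib are installed.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data, invented for this illustration only
df = pd.DataFrame({
    "language": ["Python", "C++", "Java"],
    "popularity": [10, 8, 6],
})

# DataFrame.plot() draws with Matplotlib and returns the Axes object
ax = df.plot(kind="bar", x="language", y="popularity", legend=False)
ax.set_title("Bar chart drawn with pandas")
ax.set_ylabel("Popularity")

plt.show()
```

Because the call returns a regular Matplotlib Axes, the title and labels can be adjusted with the usual Matplotlib methods.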

Types of Data Visualization in Python

There are various ways to represent data, and the choice of plot plays a major role in communicating the result of any analysis or study. Here, we will use the matplotlib library in Python to illustrate each type of plot.

Listed below are some of the major kinds of Data Visualization that can be plotted and used in Data Science.

1. Line plots

Line plots are the simplest plots in Data Visualization. A line plot connects a sequence of points given by their x and y coordinates. Hence, it is commonly used to plot equations. It can be plotted by using the plot() function of the matplotlib module.

A simple example of a line plot in Python is given below.

import matplotlib.pyplot as plt

# Creating data
x = [1, 2, 3, 4]
y = [1, 2, 3, 4]

# Plotting the line
plt.plot(x, y)

# Setting the title and labels
plt.title("Sample line plot")
plt.xlabel('x-axis')
plt.ylabel('y-axis')

# Display the plot
plt.show()

[Figure: sample line plot]

Note: For plotting in a Jupyter notebook, please add the line %matplotlib inline after importing the matplotlib module.
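
Since line plots are often used to plot equations, here is a minimal sketch that draws y = x^2 with NumPy. The choice of equation, the range of x, and the number of sample points are arbitrary assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# 50 evenly spaced x values between -3 and 3 (arbitrary choice)
x = np.linspace(-3, 3, 50)
y = x ** 2

# More sample points give a smoother-looking curve
plt.plot(x, y)

plt.title("Line plot of y = x^2")
plt.xlabel("x")
plt.ylabel("y")

plt.show()
```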

Python code for Plotting 2 lines:

import matplotlib.pyplot as plt

# Coordinates for line1
x1 = [1, 2, 3, 4]
y1 = [1, 2, 3, 4]

# Coordinates for line2
x2 = [1, 2, 3, 4]
y2 = [1, 4, 9, 16]

# Plotting the lines
plt.plot(x1, y1, 'r--')
plt.plot(x2, y2)

# Setting the title and labels
plt.title("Sample line plot for 2 plots")
plt.xlabel('x-axis')
plt.ylabel('y-axis')

# Display the plot
plt.show()

[Figure: sample line plot with two lines]
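
The third argument in plt.plot(x1, y1, 'r--') is a Matplotlib format string: r selects red and -- selects a dashed line. The sketch below shows two more format strings together with the label argument, so that plt.legend() can tell the lines apart. The data and label texts are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]

# 'g-o': green solid line with circle markers
line1 = plt.plot(x, [1, 2, 3, 4], 'g-o', label='y = x')
# 'b:': blue dotted line
line2 = plt.plot(x, [1, 4, 9, 16], 'b:', label='y = x^2')

# The legend picks up the label given to each line
plt.legend()
plt.title("Format strings and legend")

plt.show()
```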

2. Bar charts

Bar charts are used to compare values across different categories, with the height of each bar representing a value. A bar chart can be plotted by using the bar() function of the matplotlib module.

A simple example of a bar chart in Python is given below.

import numpy as np
import matplotlib.pyplot as plt

languages = ('Python', 'C++', 'Java', 'Perl', 'Scala', 'Lisp')
y_pos = np.arange(len(languages))
popularity = [10, 8, 6, 4, 2, 1]

# Plot the bar chart
plt.bar(y_pos, popularity, align='center')

# Setting the title and labels
plt.title('Programming language statistics')
plt.xlabel('Languages')
plt.ylabel('Popularity')
plt.xticks(y_pos, languages)

# Display the plot
plt.show()

[Figure: sample bar chart]
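
Reasonably recent versions of Matplotlib also accept a list of strings as the x values of bar(), which removes the need for np.arange() and plt.xticks(). The sketch below redraws the same chart this way, as an alternative to the example above rather than a replacement for it.

```python
import matplotlib.pyplot as plt

languages = ['Python', 'C++', 'Java', 'Perl', 'Scala', 'Lisp']
popularity = [10, 8, 6, 4, 2, 1]

# Strings are used directly as category labels on the x-axis
bars = plt.bar(languages, popularity)

plt.title('Programming language statistics')
plt.xlabel('Languages')
plt.ylabel('Popularity')

plt.show()
```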

3. Stacked bar charts

The stacked bar chart is a modified version of the bar chart. Such charts are used to visualize the values of subgroups within a group of values. It can be plotted by calling the bar() function of the matplotlib module once per subgroup and passing the bottom argument, so that each new set of bars starts on top of the previous one.

A simple example of a stacked bar chart in Python is given below.

import matplotlib.pyplot as plt

# Data values
year = [2014, 2015, 2016, 2017, 2018, 2019]
men = [39, 77, 98, 54, 28, 15]
women = [3, 10, 13, 56, 39, 14]

# Setting the figure size
plt.figure(figsize=(10, 8))

# Plotting the bar chart for men
p1 = plt.bar(year, men, color="#fba500")

# Plotting the second bar chart on top of the previous one
p2 = plt.bar(year, women, bottom = men, color="#3792cb")

# Setting the title and labels
plt.title("Sample Stacked bar chart")
plt.xlabel('Year')  
plt.ylabel('Number of Students Graduated')

# Setting the legend
plt.legend((p1[0], p2[0]), ('Men', 'Women'))

# Display the plot
plt.show() 
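
The bottom argument is what turns ordinary bars into a stack: each new layer starts where the layers below it end. For more than two groups, bottom has to be the running total of all earlier layers. The sketch below shows this with three layers; the group names and numbers are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

year = [2017, 2018, 2019]
# Hypothetical counts for three groups
group_a = np.array([20, 35, 30])
group_b = np.array([25, 32, 34])
group_c = np.array([10, 15, 12])

# Each layer starts at the running total of the layers below it
plt.bar(year, group_a, label='Group A')
plt.bar(year, group_b, bottom=group_a, label='Group B')
top = plt.bar(year, group_c, bottom=group_a + group_b, label='Group C')

plt.xticks(year)
plt.title("Stacked bar chart with three layers")
plt.legend()

plt.show()
```

NumPy arrays are used here so that group_a + group_b adds the values element by element; with plain Python lists, + would concatenate them instead.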

4. Histogram

Histograms are commonly used to show the distribution of a numeric variable. The range of the data is divided into intervals (bins), and the height of each bar shows how frequently values fall into that interval. A histogram can be plotted by using the hist() function of the matplotlib module.

A simple example of a histogram in Python is given below.

import matplotlib.pyplot as plt
import numpy as np

# Setting the figure size
plt.figure(figsize=(10, 8))

# Generating 100 random values from a standard normal distribution
x = np.random.normal(size=100)

# Plotting the histogram (density=True replaces the removed normed argument)
plt.hist(x, density=True, bins=10)

# Setting labels
plt.title("Sample Histogram Plot")
plt.xlabel('x-axis')
plt.ylabel('y-axis')

# Display the plot
plt.show()

[Figure: sample histogram plot]
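
When hist() is called with density=True, the bar heights are scaled so that the total area of the bars equals 1, which makes the histogram comparable to a probability density. hist() also returns the bar heights and bin edges, which the sketch below uses to check that property. The seed, sample size, and number of bins are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Seeded generator so that the example is reproducible
rng = np.random.default_rng(42)
x = rng.normal(size=1000)

# hist() returns the bar heights, the bin edges and the drawn patches
heights, bin_edges, patches = plt.hist(x, bins=20, density=True)

# Area of each bar = height * bin width; the areas sum to 1
total_area = np.sum(heights * np.diff(bin_edges))
print(total_area)

plt.title("Normalized histogram")
plt.show()
```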

5. Scatter plots

Scatter plots show the relationship between two variables by drawing each observation as an individual point. Plotting two or more classes in different colors makes it easy to compare how each class behaves. A scatter plot can be plotted by using the scatter() function of the matplotlib module.

A simple example of a scatter plot in Python is given below.

import matplotlib.pyplot as plt

marks_boys = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]
marks_girls = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
marks_range = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Setting the figure size
plt.figure(figsize=(10, 8))

# Plotting the scatter plot
plt.scatter(marks_range, marks_boys, color='b')
plt.scatter(marks_range, marks_girls, color='r')

# Setting title and labels
plt.title('Sample scatter plot')
plt.xlabel('Marks Range')
plt.ylabel('Marks Obtained')

# Setting the legend
plt.legend(('Boys', 'Girls'))

# Displaying the plot
plt.show()

[Figure: sample scatter plot]
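
An alternative way to build the legend is to pass label to each scatter() call, so that each label stays next to the data it describes instead of depending on the order of the calls. A minimal sketch, using shortened made-up data and the Axes-based interface:

```python
import matplotlib.pyplot as plt

# Shortened, made-up data for illustration
x = [10, 20, 30, 40, 50]
marks_boys = [30, 29, 49, 48, 100]
marks_girls = [89, 90, 70, 89, 100]

fig, ax = plt.subplots()

# Each call carries its own label, color and marker shape
ax.scatter(x, marks_boys, color='b', marker='o', label='Boys')
ax.scatter(x, marks_girls, color='r', marker='^', label='Girls')

ax.set_title('Scatter plot with labelled classes')
legend = ax.legend()

plt.show()
```

Using a different marker shape per class also keeps the classes distinguishable when the plot is printed in grayscale.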

6. Area plots

Area plots fill the region between a line plot and the axis, which emphasizes the magnitude of the values. An area plot can be plotted by using the fill_between() function of the matplotlib module.

A simple example of an area plot in Python is given below.

import matplotlib.pyplot as plt
 
# Create data
x = range(0, 4)
y = [1, 2, 6, 3]
 
# Setting the figure size
plt.figure(figsize=(10, 8))

# Area plot
plt.fill_between(x, y, alpha = 0.6)

# Setting title and labels
plt.title('Sample area plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Displaying the plot
plt.show()

[Figure: sample area plot]
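
fill_between() can also shade the region between two curves when a second set of y values is passed. The sketch below shades the band between a lower and an upper line; the values are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
lower = [1, 2, 4, 3]
upper = [2, 4, 6, 5]

# Draw both boundary lines
plt.plot(x, lower, color='tab:blue')
plt.plot(x, upper, color='tab:blue')

# Shade the region between the two lines
band = plt.fill_between(x, lower, upper, alpha=0.3)

plt.title('Area between two lines')
plt.show()
```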

In Summary

These are the basic plots that are commonly used in Python for data science. What is your go-to library for data visualization in Python? Let us know in the comment section below.
