How to connect scatterplot points with line using matplotlib
Connecting Scatter Plot Points with Lines in Matplotlib

Learn how to create scatter plots and connect their points with lines using Matplotlib, a fundamental skill for visualizing sequential data or trends.
Matplotlib is a powerful plotting library in Python, widely used for creating static, animated, and interactive visualizations. While scatter plots are excellent for showing the distribution of individual data points, sometimes you need to illustrate a sequence or a trend by connecting these points with lines. This article will guide you through the process of combining scatter plot functionality with line plots to achieve this effect, providing clear examples and best practices.
Understanding Scatter Plots and Line Plots
Before diving into connecting points, it's crucial to understand the distinct roles of scatter plots and line plots. A scatter plot uses markers to display values for two variables, ideal for showing relationships or clusters in data. A line plot, on the other hand, connects data points with lines, typically used to visualize data over a continuous interval or to show trends over time. Combining these two allows you to highlight individual data points while simultaneously illustrating the underlying sequence or trend.
graph TD
    A[Data Points] --> B{Plot Type?}
    B -->|Individual Relationship| C[Scatter Plot]
    B -->|Sequential Trend| D[Line Plot]
    B -->|Individual + Sequential| E[Scatter + Line Plot]
    C --> F[Visualize Distribution]
    D --> G[Visualize Trend]
    E --> H[Visualize Distribution & Trend]
Decision flow for choosing plot types based on data visualization goals.
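To make the three options concrete, the following minimal sketch draws the same made-up data as a scatter plot, a line plot, and a combination of both, side by side. The data values and variable names are illustrative assumptions, and the off-screen Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data, made up for this sketch
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 5, 3, 6, 4])

fig, (ax_scatter, ax_line, ax_both) = plt.subplots(1, 3, figsize=(12, 4), sharey=True)

# Markers only: each point stands on its own
ax_scatter.scatter(x, y)
ax_scatter.set_title("Scatter plot")

# Line only: points are joined in array order, no markers
ax_line.plot(x, y)
ax_line.set_title("Line plot")

# Both: the line shows the sequence, the markers show each observation
ax_both.plot(x, y)
ax_both.scatter(x, y, zorder=3)  # zorder lifts the markers above the line
ax_both.set_title("Scatter + line")

fig.tight_layout()
```

To view the result, call fig.savefig() with a file name of your choice, or drop the Agg line and call plt.show().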
Basic Connection: Plotting and Scattering Together
The most straightforward way to connect scatter plot points with lines is to use both plt.plot() and plt.scatter() on the same axes. Given two arrays, plt.plot() draws a line connecting the points in the order they appear in the arrays. A plt.scatter() call on the same data then adds distinct markers at each point; setting zorder on that call makes the markers render above the line rather than behind it.
import matplotlib.pyplot as plt
import numpy as np
# Sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 5, 3, 6, 4])
plt.figure(figsize=(8, 5))
# Plot the line connecting points
plt.plot(x, y, color='blue', linestyle='-', linewidth=2, label='Connected Line')
# Overlay scatter points
plt.scatter(x, y, color='red', marker='o', s=100, zorder=5, label='Data Points')
plt.title('Scatter Plot with Connected Lines')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.legend()
plt.show()
Basic example of connecting scatter points with a line using plt.plot() and plt.scatter().
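If every marker can share the same size and color, a single plt.plot() call with a marker style is a common alternative to the two-call approach. The sketch below reuses the sample data from the example above; the styling values are arbitrary choices, and the Agg backend is set only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Same sample data as the two-call example
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 5, 3, 6, 4])

fig, ax = plt.subplots(figsize=(8, 5))

# The "-o" format string requests a solid line with circle markers in one artist
(line,) = ax.plot(
    x, y, "-o",
    color="blue",             # line color
    linewidth=2,
    markerfacecolor="red",    # marker fill
    markeredgecolor="red",    # marker outline
    markersize=10,
    label="Data",
)

ax.set_title("Line with markers from a single plot() call")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
ax.grid(True)
ax.legend()
```

The trade-off is flexibility: markers drawn this way are uniform, whereas plt.scatter() accepts per-point sizes and colors, so the two-call approach remains the better fit when individual points need to be styled differently.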
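The zorder parameter in plt.scatter() ensures that the scatter points are drawn on top of the line, making them more prominent. A higher zorder value means the element is drawn later and thus appears on top. Without it, Matplotlib's defaults place scatter markers (zorder 1) beneath lines (zorder 2), regardless of the order of the calls.
Advanced Customization and Ordering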
For more control, especially when dealing with unsorted data or specific connection requirements, you might need to sort your data or explicitly define the order of connections. Because plt.plot() joins points in array order, data that does not already arrive in a meaningful order (as a time series usually does) should be sorted by the x-values, or by another relevant variable, before plotting the line; otherwise the line zigzags back and forth across the plot. You can also customize line styles, colors, and marker types to enhance readability.
import matplotlib.pyplot as plt
import numpy as np
# Sample unsorted data
x_unsorted = np.array([3, 1, 5, 2, 4])
y_unsorted = np.array([30, 10, 50, 20, 40])
# Sort data by x-values for meaningful line connection
sorted_indices = np.argsort(x_unsorted)
x_sorted = x_unsorted[sorted_indices]
y_sorted = y_unsorted[sorted_indices]
plt.figure(figsize=(8, 5))
# Plot the line using sorted data
plt.plot(x_sorted, y_sorted, color='green', linestyle='--', linewidth=1.5, label='Sorted Trend Line')
# Overlay scatter points (marker positions do not depend on array order, so the unsorted arrays work as-is)
plt.scatter(x_unsorted, y_unsorted, color='purple', marker='X', s=120, zorder=5, label='Original Data Points')
plt.title('Scatter Plot with Line (Sorted X-axis)')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.legend()
plt.show()
Connecting points after sorting data to ensure a logical line progression.
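Sorting by x is not the only option. When the connection order comes from another variable, the same np.argsort() pattern applies to that variable instead. The following sketch assumes a hypothetical array t of observation times (not part of the examples above) and connects the points in time order, coloring each marker by its t value. All data is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: positions (x, y) recorded at times t, stored out of order
t = np.array([2, 0, 4, 1, 3])
x = np.array([3.0, 1.0, 2.0, 4.0, 2.5])
y = np.array([1.0, 2.0, 4.0, 3.0, 0.5])

# Indices that put the observations in chronological order
order = np.argsort(t)
x_path, y_path = x[order], y[order]

fig, ax = plt.subplots(figsize=(8, 5))

# The line follows the time-ordered path, not the x-order
(path_line,) = ax.plot(x_path, y_path, linestyle=":", color="gray",
                       linewidth=1.5, label="Path in time order")

# Markers colored by time; zorder keeps them above the line
points = ax.scatter(x, y, c=t, cmap="viridis", s=100, zorder=3,
                    label="Observations")
fig.colorbar(points, ax=ax, label="t (hypothetical time)")

ax.set_title("Points connected in order of a third variable")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
ax.legend()
```

Here the line may cross itself, which is expected: it traces the sequence of observations rather than a function of x.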