Extrapolate with LinearNDInterpolator
Beyond Interpolation: Extrapolating Data with SciPy's LinearNDInterpolator

Learn how to leverage SciPy's LinearNDInterpolator for both interpolation and extrapolation of scattered N-dimensional data, addressing common pitfalls and providing practical Python examples.
When working with scattered data points in N dimensions, scipy.interpolate.LinearNDInterpolator is a powerful tool for estimating values at unobserved locations. While its primary purpose is interpolation (estimating values within the convex hull of the data), it can also be adapted for extrapolation (estimating values outside this hull). This article will guide you through using LinearNDInterpolator for both scenarios, focusing on how to handle extrapolation effectively and understand its limitations.
Understanding LinearNDInterpolator
LinearNDInterpolator constructs a piecewise linear interpolant over a triangulation of the input data points. For 2D data, this means creating a Delaunay triangulation and then performing linear interpolation within each triangle. For higher dimensions, it uses the same concept based on simplices. By default, LinearNDInterpolator returns nan for points outside the convex hull of the input data, indicating that it does not perform extrapolation by default.
flowchart TD
    A["Input Data Points (x, y, z)"] --> B{Delaunay Triangulation}
    B --> C{Construct Linear Interpolant}
    C --> D{"Query Point (xq, yq)"}
    D --> E{"Is (xq, yq) inside Convex Hull?"}
    E -- Yes --> F[Return Interpolated Value]
    E -- No --> G["Return NaN (Default Extrapolation Behavior)"]
Default behavior of LinearNDInterpolator
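To make the default behavior concrete, here is a minimal sketch on a hand-made data set. The five points and the function z = x + y are assumptions chosen for illustration so that the interpolated value is easy to check by hand.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# Corners and center of the unit square, with values z = x + y
points = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
values = points[:, 0] + points[:, 1]

interp = LinearNDInterpolator(points, values)

inside = interp(np.array([[0.25, 0.5]]))   # query inside the convex hull
outside = interp(np.array([[2.0, 2.0]]))   # query outside the convex hull

print(inside)    # close to 0.75, since the data are exactly linear
print(outside)   # nan, because no fill_value was given
```

Because the sample values come from a linear function, the piecewise linear interpolant reproduces it inside the hull, while the outside query is simply marked as undefined.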
Basic Interpolation with LinearNDInterpolator
Let's start with a simple example of how to use LinearNDInterpolator for its intended purpose: interpolation. We'll generate some 2D scattered data and then interpolate values on a regular grid.
import numpy as np
from scipy.interpolate import LinearNDInterpolator
import matplotlib.pyplot as plt
# 1. Generate scattered data points
np.random.seed(0)
num_points = 50
x = np.random.rand(num_points) * 10
y = np.random.rand(num_points) * 10
z = np.sin(x) + np.cos(y) + np.random.rand(num_points) * 0.5 # Some function with noise
points = np.vstack((x, y)).T
values = z
# 2. Create the interpolator
interp = LinearNDInterpolator(points, values)
# 3. Define a grid for interpolation
grid_x, grid_y = np.mgrid[0:10:100j, 0:10:100j]
grid_points = np.vstack((grid_x.ravel(), grid_y.ravel())).T
# 4. Perform interpolation
interpolated_values = interp(grid_points)
interpolated_values = interpolated_values.reshape(grid_x.shape)
# 5. Plot the results
plt.figure(figsize=(10, 8))
plt.pcolormesh(grid_x, grid_y, interpolated_values, shading='auto', cmap='viridis')
plt.scatter(x, y, c=z, edgecolors='k', s=50, cmap='viridis', label='Original Data')
plt.colorbar(label='Z value')
plt.title('LinearNDInterpolator: Interpolation within Convex Hull')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.grid(True)
plt.show()
Basic interpolation using LinearNDInterpolator
Enabling Extrapolation
By default, LinearNDInterpolator returns nan for points outside the convex hull. To get a finite value there instead, pass the fill_value argument; it is returned for every query point that falls outside the convex hull of the input data. While this gives you an output everywhere, it's crucial to understand that LinearNDInterpolator does not perform a sophisticated extrapolation such as trend-based or curve-fitting extrapolation. It simply assigns the same constant fill_value to all points outside the hull.
Using fill_value for extrapolation with LinearNDInterpolator is a simple constant assignment. It does not extend the linear interpolation behavior beyond the convex hull. For more robust extrapolation, consider other methods like scipy.interpolate.RBFInterpolator with an appropriate kernel function, or fitting a global model to your data.
import numpy as np
from scipy.interpolate import LinearNDInterpolator
import matplotlib.pyplot as plt
# 1. Generate scattered data points (same as before)
np.random.seed(0)
num_points = 50
x = np.random.rand(num_points) * 10
y = np.random.rand(num_points) * 10
z = np.sin(x) + np.cos(y) + np.random.rand(num_points) * 0.5
points = np.vstack((x, y)).T
values = z
# 2. Create the interpolator with a fill_value for extrapolation
# Let's use the mean of the data as a simple fill_value for demonstration
mean_z = np.mean(z)
interp_extrap = LinearNDInterpolator(points, values, fill_value=mean_z)
# 3. Define a larger grid for extrapolation
# Extend the grid beyond the original data range (0-10)
grid_x_ext, grid_y_ext = np.mgrid[-2:12:100j, -2:12:100j]
grid_points_ext = np.vstack((grid_x_ext.ravel(), grid_y_ext.ravel())).T
# 4. Perform interpolation and extrapolation
extrapolated_values = interp_extrap(grid_points_ext)
extrapolated_values = extrapolated_values.reshape(grid_x_ext.shape)
# 5. Plot the results
plt.figure(figsize=(10, 8))
plt.pcolormesh(grid_x_ext, grid_y_ext, extrapolated_values, shading='auto', cmap='viridis')
plt.scatter(x, y, c=z, edgecolors='k', s=50, cmap='viridis', label='Original Data')
plt.colorbar(label='Z value')
plt.title(f'LinearNDInterpolator: Extrapolation with fill_value={mean_z:.2f}')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.grid(True)
plt.show()
Extrapolation using LinearNDInterpolator with a fill_value
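For comparison with the constant fill shown above, the following is a minimal sketch of scipy.interpolate.RBFInterpolator (available in SciPy 1.7 and later), which produces finite, smoothly varying values outside the convex hull. The synthetic noise-free data, the thin_plate_spline kernel, and the query locations are assumptions for illustration, not a recommendation for any particular data set.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Synthetic scattered data on [0, 10] x [0, 10] (illustrative, noise-free)
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 10.0, size=(50, 2))
values = np.sin(points[:, 0]) + np.cos(points[:, 1])

# smoothing=0.0 makes the RBF pass through the data points
rbf = RBFInterpolator(points, values, kernel="thin_plate_spline", smoothing=0.0)

# One query inside the data range and two outside it
query = np.array([[5.0, 5.0], [-2.0, -2.0], [12.0, 12.0]])
rbf_values = rbf(query)

print(rbf_values)  # finite everywhere; no NaN or constant fill outside the hull
```

The extrapolated values follow the shape of the chosen kernel rather than the underlying process, so they become less trustworthy the farther the query lies from the data.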
Choosing an Appropriate fill_value
The choice of fill_value is critical when using LinearNDInterpolator for extrapolation. A poorly chosen fill_value can lead to misleading results or visual artifacts. Consider these options:
- Mean/Median of Data: A simple approach, but can create a flat, abrupt boundary.
- Nearest Neighbor Value: While not directly supported by LinearNDInterpolator's fill_value, you could implement this manually by finding the nearest data point for extrapolated queries.
- Constant based on domain knowledge: If you know the expected behavior outside the data range (e.g., values should approach zero, or a specific physical constant).
- np.nan (default): If you want to explicitly mark extrapolated regions as undefined, which is often the safest approach if you don't have strong assumptions about the extrapolated behavior.
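The nearest-neighbor option from the list above can be sketched by combining LinearNDInterpolator with scipy.interpolate.NearestNDInterpolator. The helper name interpolate_with_nearest_fallback is hypothetical, and the sketch assumes the input values contain no NaNs, since NaN is used here to detect points outside the hull.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, NearestNDInterpolator


def interpolate_with_nearest_fallback(points, values, query_points):
    """Linear interpolation inside the hull, nearest data value outside it.

    Hypothetical helper; assumes `values` contains no NaNs.
    """
    query_points = np.asarray(query_points, dtype=float)
    linear = LinearNDInterpolator(points, values)    # NaN outside the hull
    nearest = NearestNDInterpolator(points, values)  # defined everywhere

    result = linear(query_points)
    outside = np.isnan(result)                       # mask of extrapolated queries
    if outside.any():
        result[outside] = nearest(query_points[outside])
    return result, outside


# Corners and center of the unit square, with values z = x + 2y
points = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], dtype=float)
values = points[:, 0] + 2.0 * points[:, 1]

query = np.array([[0.25, 0.5], [2.0, 2.0]])
result, outside = interpolate_with_nearest_fallback(points, values, query)
print(result, outside)
```

Returning the mask alongside the values keeps the extrapolated region identifiable, so it can still be flagged or hatched in a plot.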
For smoother extrapolation, consider scipy.interpolate.RBFInterpolator. It extrapolates based on radial basis functions, which in many cases gives smoother and more plausible results than a constant fill, depending on the chosen kernel.
Steps for Effective Extrapolation with LinearNDInterpolator
While LinearNDInterpolator's fill_value is a basic form of extrapolation, here's a structured approach to using it:
1. Prepare Your Data
Organize your N-dimensional coordinates into a (N_points, N_dimensions) NumPy array and your corresponding values into a (N_points,) array.
2. Instantiate LinearNDInterpolator
Create an instance of LinearNDInterpolator with your points and values. Crucially, specify a fill_value that makes sense for your application. If you're unsure, np.nan is a safe default to identify extrapolated regions.
3. Define Query Points
Generate the N-dimensional coordinates where you want to estimate values. For extrapolation, ensure these points extend beyond the convex hull of your original data.
4. Perform Query
Call the interpolator instance with your query points. It will return interpolated values for points within the convex hull and the fill_value for points outside.
5. Analyze and Visualize Results
Carefully examine the output, especially in the extrapolated regions. Visualize the results to understand the impact of your chosen fill_value and the extent of extrapolation.
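The five steps above can be sketched end to end. This version uses 3D data to show that nothing is specific to two dimensions. The synthetic linear test function, the query range, and the choice of the data mean as the final fill are assumptions for illustration only.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

# 1. Prepare data: coordinates (n_points, n_dims) and values (n_points,)
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(200, 3))
values = points.sum(axis=1)            # illustrative linear function x + y + z

# 2. Instantiate with np.nan so extrapolated queries stay identifiable
interp = LinearNDInterpolator(points, values, fill_value=np.nan)

# 3. Define query points that extend beyond the data range
query = rng.uniform(-0.5, 1.5, size=(1000, 3))

# 4. Perform the query
estimates = interp(query)

# 5. Analyze: separate interpolated from extrapolated results
outside = np.isnan(estimates)
print(f"{outside.mean():.0%} of the queries fall outside the convex hull")

# Apply a constant fill afterwards, once its effect is understood
filled = np.where(outside, values.mean(), estimates)
```

Querying with np.nan first and filling afterwards keeps the mask of extrapolated points available for inspection, which is lost if a finite fill_value is passed directly.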