Creating a histogram with distribution curve, where the curve series is larger than the bin series
Creating Histograms with Overlapping Distribution Curves in Highcharts

Learn how to visualize data distributions effectively by combining a histogram with a smooth distribution curve in Highcharts, even when the curve data is more granular than the histogram bins.
Histograms are powerful tools for visualizing the distribution of a dataset. When combined with a distribution curve, they provide an even richer insight into the underlying probability density function. This article will guide you through creating such a chart using Highcharts, specifically addressing the common scenario where your distribution curve data might be more detailed (have more points) than your histogram bins.
Understanding the Challenge: Mismatched Data Granularity
Typically, a histogram groups data into 'bins', representing frequency counts within specific ranges. A distribution curve, on the other hand, often requires a smoother, more continuous representation, which means it might be generated from a larger number of data points or a mathematical function. The challenge arises when you want to overlay these two series on the same chart, as their X-axis data points (or categories) might not align perfectly. Highcharts provides flexible ways to handle this, primarily by using different series types and ensuring proper X-axis mapping.
flowchart TD
    A[Raw Data] --> B{Bin Data for Histogram}
    A --> C{Generate Smoother Data for Curve}
    B --> D[Highcharts Histogram Series]
    C --> E[Highcharts Spline/Area Series]
    D & E --> F[Combined Chart with Shared X-Axis]
Data flow for creating a histogram with an overlaid distribution curve.
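The binning step in the diagram can be sketched as follows. This is a minimal illustration, not part of Highcharts: binData is a hypothetical helper, the sample values are made up, and equal-width bins are assumed. It returns bin centers rather than bin starts because a Highcharts column is centered on its x value by default.

```javascript
// Hypothetical helper (not a Highcharts API): bin raw values into
// equal-width bins and return [binCenter, count] pairs.
function binData(values, min, max, binCount) {
    const width = (max - min) / binCount;
    const counts = new Array(binCount).fill(0);
    for (const v of values) {
        if (v < min || v > max) continue; // ignore out-of-range values
        // A value equal to max goes into the last bin rather than past it.
        const index = Math.min(Math.floor((v - min) / width), binCount - 1);
        counts[index] += 1;
    }
    return counts.map((count, i) => [min + (i + 0.5) * width, count]);
}

// Made-up sample values, for illustration only.
const sample = [0.2, 0.7, 1.1, 1.4, 1.9, 2.2, 2.5, 2.6, 3.1, 3.3, 4.0, 4.8];
const histogramData = binData(sample, 0, 5, 5);
```

The resulting histogramData array has the [x, count] shape that a column series accepts, while the curve series remains free to use as many x values as it needs.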
Setting Up the Highcharts Configuration
To achieve our goal, we'll use two distinct series types: a column series for the histogram and a spline or area series for the distribution curve. The key is to ensure both series share the same X-axis and that their data is correctly formatted. The histogram data will typically be an array of [bin_start, count] or [category, count], while the curve data will be [x_value, y_value] pairs, where x_value can be more granular. Note that Highcharts centers each column on its x value by default, so if the bars must line up exactly with the bin edges, use the bin center as the x value instead of bin_start.
Highcharts.chart('container', {
    title: {
        text: 'Histogram with Distribution Curve'
    },
    xAxis: {
        title: {
            text: 'Value'
        }
    },
    yAxis: {
        title: {
            text: 'Frequency / Density'
        }
    },
    series: [{
        name: 'Histogram',
        type: 'column',
        data: [
            [0, 5], [1, 10], [2, 15], [3, 20], [4, 12], [5, 8]
        ],
        pointPadding: 0,
        groupPadding: 0,
        borderWidth: 0
    }, {
        name: 'Distribution Curve',
        type: 'spline',
        data: [
            [0, 2], [0.5, 7], [1, 12], [1.5, 17], [2, 20], [2.5, 18], [3, 14], [3.5, 10], [4, 6], [4.5, 4], [5, 2]
        ],
        marker: {
            enabled: false
        },
        lineWidth: 2,
        color: Highcharts.getOptions().colors[1] // Use a different color
    }]
});
Basic Highcharts configuration for a histogram with an overlaid spline curve.
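The single 'Frequency / Density' axis above works only if the curve values are already on the same scale as the counts. If the curve is a true density (its area is 1), its values will be far smaller than the bar heights. One option, sketched here, is a secondary y-axis: yAxis is given as an array and each series selects an axis by index. The density values below are made up for illustration, and the Highcharts.chart call is guarded so the snippet also runs where Highcharts is not loaded.

```javascript
// Sketch: counts on the left axis, density on a secondary right-hand axis.
// The density values are illustrative, not derived from real data.
const dualAxisOptions = {
    title: { text: 'Histogram with Distribution Curve' },
    xAxis: { title: { text: 'Value' } },
    yAxis: [
        { title: { text: 'Frequency' } },                       // index 0: histogram counts
        { title: { text: 'Density' }, opposite: true, min: 0 }  // index 1: curve values
    ],
    series: [{
        name: 'Histogram',
        type: 'column',
        yAxis: 0,
        data: [[0, 5], [1, 10], [2, 15], [3, 20], [4, 12], [5, 8]],
        pointPadding: 0,
        groupPadding: 0,
        borderWidth: 0
    }, {
        name: 'Distribution Curve',
        type: 'spline',
        yAxis: 1,
        zIndex: 1, // draw the curve above the columns
        data: [
            [0, 0.03], [0.5, 0.10], [1, 0.17], [1.5, 0.24], [2, 0.29], [2.5, 0.26],
            [3, 0.20], [3.5, 0.14], [4, 0.09], [4.5, 0.06], [5, 0.03]
        ],
        marker: { enabled: false }
    }]
};

// Render only when Highcharts is available (for example, in a browser page).
if (typeof Highcharts !== 'undefined') {
    Highcharts.chart('container', dualAxisOptions);
}
```

With two independent axes, how closely the curve appears to track the bar heights depends on how each axis is scaled, so this layout suits cases where the shapes matter more than a direct height comparison.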
pointPadding: 0 and groupPadding: 0 ensure that the columns touch, which is characteristic of a true histogram. borderWidth: 0 can also improve the visual continuity.
Generating Data for the Distribution Curve
The distribution curve often represents a theoretical probability density function (e.g., Normal, Poisson) or a smoothed empirical distribution. If you have raw data, you might use kernel density estimation (KDE) to generate the curve. For this example, we'll assume you have a set of [x, y] pairs for your curve. The x values for the curve should ideally span the range of your histogram bins and be more numerous to create a smooth appearance.
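If you start from raw data, a kernel density estimate might look like the sketch below. It assumes a Gaussian kernel and a hand-picked bandwidth; gaussianKde is a hypothetical helper, not a Highcharts function, and the raw values are made up.

```javascript
// Gaussian kernel density estimate evaluated on an evenly spaced grid.
// `bandwidth` controls the smoothing and is chosen by hand here (an assumption).
function gaussianKde(values, bandwidth, min, max, numPoints) {
    const norm = 1 / (values.length * bandwidth * Math.sqrt(2 * Math.PI));
    const step = (max - min) / (numPoints - 1);
    const curve = [];
    for (let i = 0; i < numPoints; i++) {
        const x = min + i * step;
        let sum = 0;
        for (const v of values) {
            const z = (x - v) / bandwidth;
            sum += Math.exp(-0.5 * z * z); // kernel contribution of one observation
        }
        curve.push([x, norm * sum]);
    }
    return curve;
}

// Made-up raw values, for illustration only.
const rawValues = [0.2, 0.7, 1.1, 1.4, 1.9, 2.2, 2.5, 2.6, 3.1, 3.3, 4.0, 4.8];
const kdeCurve = gaussianKde(rawValues, 0.5, 0, 5, 101);
```

A larger bandwidth gives a smoother but flatter curve, and a smaller one follows the individual observations more closely. The output is a density, so it still has to be scaled or plotted on its own axis before it is compared with raw counts.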
// Example of generating more granular data for a curve.
// Evaluates func at numPoints evenly spaced x values from min to max (inclusive).
function generateCurveData(min, max, numPoints, func) {
    if (numPoints < 2) {
        // A single point would make the step below divide by zero.
        throw new RangeError('numPoints must be at least 2');
    }
    const data = [];
    const step = (max - min) / (numPoints - 1);
    for (let i = 0; i < numPoints; i++) {
        const x = min + i * step;
        data.push([x, func(x)]);
    }
    return data;
}

// A simple example function: an unnormalized Gaussian (bell curve)
// centered at 2.5 with a peak height of 25.
const bellCurve = (x) => {
    const mean = 2.5;
    const stdDev = 1.0;
    return 25 * Math.exp(-0.5 * Math.pow((x - mean) / stdDev, 2));
};

const curveData = generateCurveData(0, 5, 100, bellCurve);
// curveData can then be used in the spline series.
JavaScript function to generate granular data points for a smooth curve.
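The bell curve above uses a hand-picked peak height of 25. When the curve should sit on the same axis as the counts, a density can be rescaled instead, since the expected count in a bin is roughly totalCount * binWidth * pdf(x). The sketch below assumes a normal distribution is a reasonable model and estimates its mean and standard deviation from the bin data used in the earlier configuration. normalPdf and scaledCurve are hypothetical helpers.

```javascript
// Normal probability density function.
function normalPdf(x, mean, stdDev) {
    const z = (x - mean) / stdDev;
    return Math.exp(-0.5 * z * z) / (stdDev * Math.sqrt(2 * Math.PI));
}

// Scale a density to the count axis: count ≈ totalCount * binWidth * pdf(x).
function scaledCurve(min, max, numPoints, totalCount, binWidth, pdf) {
    const step = (max - min) / (numPoints - 1);
    return Array.from({ length: numPoints }, (_, i) => {
        const x = min + i * step;
        return [x, totalCount * binWidth * pdf(x)];
    });
}

// Histogram data from the earlier configuration: [x, count] pairs.
const bins = [[0, 5], [1, 10], [2, 15], [3, 20], [4, 12], [5, 8]];
const totalCount = bins.reduce((sum, [, count]) => sum + count, 0);
const binWidth = bins[1][0] - bins[0][0]; // assumes equal-width bins

// Weighted mean and standard deviation estimated from the binned data.
const mean = bins.reduce((s, [x, c]) => s + x * c, 0) / totalCount;
const variance = bins.reduce((s, [x, c]) => s + c * (x - mean) ** 2, 0) / totalCount;
const stdDev = Math.sqrt(variance);

// The columns are centered on their x values, so the drawn range is -0.5 to 5.5.
const curveSeriesData = scaledCurve(
    -0.5, 5.5, 100, totalCount, binWidth,
    (x) => normalPdf(x, mean, stdDev)
);
```

curveSeriesData can replace the data array of the spline series. The series has 100 points against 6 bins, which Highcharts handles without extra configuration because both series are positioned by numeric x value on the same axis.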