Plot normal, left and right skewed distribution in R
Visualizing Skewed Distributions in R: Normal, Left, and Right Skew

Learn how to generate and plot normal, left-skewed, and right-skewed distributions in R using histograms and density plots, essential for statistical analysis.
Understanding the shape of your data's distribution is a fundamental step in statistical analysis. Distributions can be symmetric, like the normal distribution, or asymmetric, exhibiting skewness. Skewness indicates the degree of asymmetry of a distribution around its mean. A positive skew (right-skewed) means the tail on the right side is longer or fatter, while a negative skew (left-skewed) means the tail on the left side is longer or fatter. This article will guide you through generating and visualizing these three common types of distributions in R.
Understanding Distribution Skewness
Before diving into the R code, let's briefly define what each type of distribution signifies:
Normal Distribution (Symmetric): Often called the bell curve, it is perfectly symmetrical around its mean, so the mean, median, and mode are all equal. Many natural phenomena are approximately normally distributed.
Right-Skewed Distribution (Positive Skew): The tail extends to the right, and the majority of the data points (mode and median) are concentrated on the left side. The mean is typically greater than the median. Examples include income distribution or housing prices.
Left-Skewed Distribution (Negative Skew): The tail extends to the left, and the majority of the data points are concentrated on the right side. The mean is typically less than the median. Examples might include exam scores where most students perform well, or the age of death in a developed country.
flowchart TD
    A[Start: Data Collection] --> B{Assess Distribution Shape}
    B -->|Symmetric| C[Normal Distribution]
    B -->|Tail Right| D[Right-Skewed Distribution]
    B -->|Tail Left| E[Left-Skewed Distribution]
    C --> F[Mean ≈ Median ≈ Mode]
    D --> G[Mean > Median > Mode]
    E --> H[Mean < Median < Mode]
    F --> I[Visualize with Histogram/Density Plot]
    G --> I
    H --> I
    I --> J[End: Interpret Skewness]
Flowchart illustrating the identification and characteristics of different distribution types.
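The mean and median relationships in the flowchart can be checked numerically. The following is a minimal sketch, not a definitive implementation: `sample_skewness()` is a hypothetical helper written for this illustration (base R has no built-in skewness function), and the Beta parameters are simply one convenient way to obtain skewed samples.

```r
# Hypothetical helper: moment-based sample skewness
# (third central moment divided by the 1.5 power of the second central moment)
sample_skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}

set.seed(123)
samples <- list(
  normal       = rnorm(10000),
  right_skewed = rbeta(10000, shape1 = 2, shape2 = 5),
  left_skewed  = rbeta(10000, shape1 = 5, shape2 = 2)
)

# One row per sample: mean, median, and skewness
shape_summary <- data.frame(
  mean     = sapply(samples, mean),
  median   = sapply(samples, median),
  skewness = sapply(samples, sample_skewness)
)
print(round(shape_summary, 3))
```

With samples of this size, the right-skewed row should show a positive skewness and a mean above the median, the left-skewed row the reverse, and the normal row a skewness close to zero.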
Generating and Plotting Distributions in R
R provides powerful tools for generating random data from various distributions and visualizing them. We'll use base R plotting functions (hist() and density()) for simplicity, but ggplot2 can also be used for more advanced visualizations.
For reproducible results, call set.seed() before calling random number generation functions like rnorm(), rbeta(), etc.
# Set a seed for reproducibility
set.seed(123)

# 1. Normal Distribution
normal_data <- rnorm(1000, mean = 0, sd = 1)

# 2. Right-Skewed Distribution (e.g., using Beta distribution with alpha < beta)
# A common way to get right skew is to have a higher concentration at lower values
right_skewed_data <- rbeta(1000, shape1 = 2, shape2 = 5)

# 3. Left-Skewed Distribution (e.g., using Beta distribution with alpha > beta)
# A common way to get left skew is to have a higher concentration at higher values
left_skewed_data <- rbeta(1000, shape1 = 5, shape2 = 2)

# Plotting all three distributions
op <- par(mfrow = c(1, 3)) # Arrange plots in 1 row, 3 columns; keep the old settings

# Normal Distribution Plot
hist(normal_data,
     main = "Normal Distribution",
     xlab = "Value",
     col = "skyblue",
     border = "white",
     freq = FALSE) # freq = FALSE puts density, not counts, on the y-axis
lines(density(normal_data), col = "blue", lwd = 2)

# Right-Skewed Distribution Plot
hist(right_skewed_data,
     main = "Right-Skewed Distribution",
     xlab = "Value",
     col = "lightcoral",
     border = "white",
     freq = FALSE)
lines(density(right_skewed_data), col = "red", lwd = 2)

# Left-Skewed Distribution Plot
hist(left_skewed_data,
     main = "Left-Skewed Distribution",
     xlab = "Value",
     col = "lightgreen",
     border = "white",
     freq = FALSE)
lines(density(left_skewed_data), col = "darkgreen", lwd = 2)

par(op) # Restore the previous plot layout
R code to generate and plot normal, right-skewed, and left-skewed distributions.
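The same three panels can also be drawn with ggplot2, mentioned earlier as an alternative to base graphics. This is a rough sketch under a few assumptions: ggplot2 (version 3.3.0 or later, for after_stat()) is installed, and the data frame name `dist_df` and its column names are made up for this illustration.

```r
set.seed(123)

# Long-format data frame: one row per observation, labelled by distribution
dist_names <- c("Normal", "Right-skewed", "Left-skewed")
dist_df <- data.frame(
  value = c(rnorm(1000, mean = 0, sd = 1),
            rbeta(1000, shape1 = 2, shape2 = 5),
            rbeta(1000, shape1 = 5, shape2 = 2)),
  distribution = factor(rep(dist_names, each = 1000), levels = dist_names)
)

# Only attempt the plot when ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dist_df, aes(x = value)) +
    # Histogram scaled to density so the density curve is comparable
    geom_histogram(aes(y = after_stat(density)), bins = 30,
                   fill = "grey80", colour = "white") +
    geom_density(colour = "blue") +
    # One panel per distribution, each with its own axis ranges
    facet_wrap(~ distribution, scales = "free") +
    labs(x = "Value", y = "Density")
  print(p)
}
```

Putting all samples in one long data frame and faceting by a label column replaces the par(mfrow = ...) layout used with base graphics.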
Interpreting the Plots
After running the code, you will see three plots side-by-side:
Normal Distribution: The histogram will show a classic bell shape, with the highest bars in the center and tapering off roughly symmetrically on both sides. Because the data are a random sample, the density curve will follow this symmetric shape closely rather than exactly.
Right-Skewed Distribution: The histogram will have its peak (mode) towards the left, and a long 'tail' extending towards the right. The density curve will follow this pattern, indicating that the mean is pulled to the right by the higher values in the tail.
Left-Skewed Distribution: The histogram will have its peak (mode) towards the right, and a long 'tail' extending towards the left. The density curve will confirm this, showing the mean being pulled to the left by the lower values in the tail.
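To make the pull of the tail on the mean visible, the mean and median can be marked directly on the histograms. The sketch below assumes the same Beta parameters as the main example; `hist_with_centers()` is a hypothetical helper introduced only for this illustration.

```r
set.seed(123)
right_skewed_data <- rbeta(1000, shape1 = 2, shape2 = 5)
left_skewed_data  <- rbeta(1000, shape1 = 5, shape2 = 2)

# Hypothetical helper: density-scaled histogram with the mean (solid red)
# and median (dashed blue) drawn as vertical lines
hist_with_centers <- function(x, title, legend_pos = "topright") {
  hist(x, main = title, xlab = "Value", col = "grey85",
       border = "white", freq = FALSE)
  abline(v = mean(x), col = "red", lwd = 2)
  abline(v = median(x), col = "blue", lwd = 2, lty = 2)
  legend(legend_pos, legend = c("Mean", "Median"),
         col = c("red", "blue"), lwd = 2, lty = c(1, 2), bty = "n")
  # Return the two statistics invisibly so they can be inspected later
  invisible(c(mean = mean(x), median = median(x)))
}

op <- par(mfrow = c(1, 2))
right_centers <- hist_with_centers(right_skewed_data, "Right-Skewed")
left_centers  <- hist_with_centers(left_skewed_data, "Left-Skewed", "topleft")
par(op)

print(round(rbind(right = right_centers, left = left_centers), 3))
```

In the right-skewed panel the mean line should sit to the right of the median line, and in the left-skewed panel to its left.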
The choice of shape1 and shape2 parameters in the rbeta() function is crucial for controlling the skewness. For rbeta(n, shape1, shape2):
- If shape1 < shape2, the distribution tends to be right-skewed.
- If shape1 > shape2, the distribution tends to be left-skewed.
- If shape1 = shape2, the distribution is symmetric (e.g., shape1 = 2, shape2 = 2 gives a symmetric, unimodal distribution).
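The effect of the two shape parameters can also be checked against the theoretical skewness of the Beta distribution, 2(b - a)sqrt(a + b + 1) / ((a + b + 2)sqrt(ab)) for shape1 = a and shape2 = b. The sketch below is illustrative only: `beta_skewness()` is a hypothetical helper, not part of base R, and the three parameter pairs are the ones used in this article.

```r
# Hypothetical helper: theoretical skewness of a Beta(shape1, shape2) distribution
beta_skewness <- function(shape1, shape2) {
  2 * (shape2 - shape1) * sqrt(shape1 + shape2 + 1) /
    ((shape1 + shape2 + 2) * sqrt(shape1 * shape2))
}

skews <- c(
  right     = beta_skewness(2, 5),  # shape1 < shape2
  left      = beta_skewness(5, 2),  # shape1 > shape2
  symmetric = beta_skewness(2, 2)   # shape1 = shape2
)
print(round(skews, 3))

# Overlay the three theoretical density curves on one plot
curve(dbeta(x, 2, 5), from = 0, to = 1, ylim = c(0, 3),
      col = "red", lwd = 2, xlab = "Value", ylab = "Density",
      main = "Beta densities for different shape parameters")
curve(dbeta(x, 5, 2), add = TRUE, col = "darkgreen", lwd = 2)
curve(dbeta(x, 2, 2), add = TRUE, col = "blue", lwd = 2)
legend("top", legend = c("shape1 = 2, shape2 = 5",
                         "shape1 = 5, shape2 = 2",
                         "shape1 = 2, shape2 = 2"),
       col = c("red", "darkgreen", "blue"), lwd = 2, bty = "n")
```

Swapping shape1 and shape2 mirrors the density around 0.5, which is why the two skewed cases have skewness of equal size and opposite sign.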
Conclusion
Visualizing data distributions is a critical skill for any data analyst or statistician. By understanding and being able to plot normal, right-skewed, and left-skewed distributions, you gain insights into the underlying characteristics of your data, which can inform your choice of statistical tests and modeling approaches. R provides a straightforward and effective way to achieve these visualizations, helping you to communicate your data's story more effectively.