Mastering Z-Scores: From Probability to Standard Deviations and Back

Unlock the power of the standard normal distribution by learning how to convert probabilities to Z-scores and Z-scores back to probabilities using Python's SciPy library.
In statistics, the Z-score (also called a standard score) is a fundamental concept that measures how many standard deviations an element is from the mean. It's a powerful tool for standardizing data, allowing for comparisons across different datasets. Understanding how to convert between probabilities and Z-scores is crucial for hypothesis testing, confidence intervals, and various data analysis tasks. This article will guide you through these conversions using Python, focusing on the scipy.stats module.
What is a Z-Score?
A Z-score tells you where your data point stands in relation to the mean of a normal distribution. A positive Z-score indicates the data point is above the mean, while a negative Z-score indicates it's below the mean. A Z-score of 0 means the data point is exactly at the mean. The formula for calculating the Z-score of a single data point x from a population with mean μ and standard deviation σ is:
Z = (x - μ) / σ
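As a minimal sketch (the score, mean, and standard deviation below are made-up values used only for illustration), the formula translates directly into Python:
# Illustrative (assumed) values: a test score of 82 from a population
# with mean 70 and standard deviation 8
x = 82
mu = 70
sigma = 8

z = (x - mu) / sigma
print(f"Z-score: {z:.2f}")  # (82 - 70) / 8 = 1.50
A Z-score of 1.5 means this score sits one and a half standard deviations above the mean.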
However, when we talk about converting probabilities to Z-scores, we're often referring to the inverse cumulative distribution function (CDF) of the standard normal distribution, which has a mean of 0 and a standard deviation of 1. This function, often denoted as Φ⁻¹(p), gives you the Z-score below which a given probability p lies.
flowchart TD
    A["Raw Data Point (x)"] --> B["Mean (μ) & Std Dev (σ)"]
    B --> C{"Calculate Z-Score: (x - μ) / σ"}
    C --> D["Z-Score"]
    D --> E["Standard Normal Distribution"]
    E --> F{"Look up Probability (CDF)"}
    F --> G["Probability (P)"]
    G --> H{"Inverse CDF (PPF)"}
    H --> D
Relationship between Raw Data, Z-Scores, and Probabilities
Converting Probability to Z-Score (Inverse CDF)
To find the Z-score corresponding to a given cumulative probability, we use the Percent Point Function (PPF), which is the inverse of the Cumulative Distribution Function (CDF). In Python, scipy.stats.norm.ppf() is the function we need. It takes a probability (a value between 0 and 1) and returns the Z-score below which that probability occurs.
from scipy.stats import norm
# Probability for a one-tailed test (e.g., 95% confidence level)
probability_one_tail = 0.95
z_score_one_tail = norm.ppf(probability_one_tail)
print(f"Z-score for {probability_one_tail*100}% probability (one-tail): {z_score_one_tail:.4f}")
# Probability for a two-tailed test (e.g., 95% confidence level)
# For a 95% confidence interval, we need 2.5% in each tail.
# So, the cumulative probability for the upper bound is 1 - 0.025 = 0.975
probability_two_tail_upper = 1 - (0.05 / 2) # For 95% CI, alpha=0.05, alpha/2 = 0.025
z_score_two_tail_upper = norm.ppf(probability_two_tail_upper)
print(f"Z-score for {probability_two_tail_upper*100}% probability (two-tail upper): {z_score_two_tail_upper:.4f}")
# Z-score for the lower bound of a two-tailed test
probability_two_tail_lower = 0.025
z_score_two_tail_lower = norm.ppf(probability_two_tail_lower)
print(f"Z-score for {probability_two_tail_lower*100}% probability (two-tail lower): {z_score_two_tail_lower:.4f}")
Using norm.ppf() to convert probabilities to Z-scores.
norm.ppf(p) returns the Z-score such that the area to its left under the standard normal curve is p. For two-tailed tests, you often need to consider 1 - (alpha / 2) for the upper critical value.
Converting Z-Score to Probability (CDF)
To find the cumulative probability associated with a given Z-score, we use the Cumulative Distribution Function (CDF). In Python, scipy.stats.norm.cdf() is the function for this. It takes a Z-score and returns the probability that a randomly selected value from a standard normal distribution will be less than or equal to that Z-score.
from scipy.stats import norm
# Example Z-score
z_score = 1.96
# Probability that a value is less than or equal to the Z-score
probability_less_than = norm.cdf(z_score)
print(f"Probability for Z-score <= {z_score}: {probability_less_than:.4f}")
# Probability that a value is greater than the Z-score
probability_greater_than = 1 - norm.cdf(z_score)
print(f"Probability for Z-score > {z_score}: {probability_greater_than:.4f}")
# Probability between two Z-scores (e.g., -1.96 and 1.96)
z_score_lower = -1.96
z_score_upper = 1.96
probability_between = norm.cdf(z_score_upper) - norm.cdf(z_score_lower)
print(f"Probability between Z-scores {z_score_lower} and {z_score_upper}: {probability_between:.4f}")
Using norm.cdf() to convert Z-scores to probabilities.
The norm.cdf(z) function always returns the cumulative probability from the far left tail up to the given Z-score. To find the probability of a value being greater than a Z-score, subtract norm.cdf(z) from 1.
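Finally, a quick sketch tying the two directions together: norm.ppf() and norm.cdf() are inverses of each other, so converting a Z-score to a probability and back recovers the original value. SciPy also offers norm.sf(z), the survival function (equivalent to 1 - norm.cdf(z)), which is more accurate for Z-scores far out in the right tail.
from scipy.stats import norm

# Round trip: Z-score -> cumulative probability -> Z-score
z = 1.2816
p = norm.cdf(z)       # area to the left of z (about 0.90)
z_back = norm.ppf(p)  # inverse CDF recovers the original Z-score
print(f"original: {z}, recovered: {z_back:.4f}")

# The survival function is an alternative to 1 - norm.cdf(z)
print(f"P(Z > {z}): {norm.sf(z):.4f}")  # about 0.10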