How to find pythagorean triplets in an array faster than O(N^2)?
Categories:
Optimizing Pythagorean Triplet Search Beyond O(N^2)

Discover advanced algorithms and data structures to efficiently find Pythagorean triplets (a² + b² = c²) in an array, significantly outperforming brute-force O(N³) and O(N²) approaches.
Finding Pythagorean triplets in a given array of numbers is a classic algorithmic problem. A triplet (a, b, c) is Pythagorean if a² + b² = c². The naive approach involves three nested loops, leading to an O(N³) time complexity. A common optimization reduces this to O(N²) by fixing 'a' and 'b' and then searching for 'c'. However, for large datasets, even O(N²) can be too slow. This article explores techniques to find Pythagorean triplets more efficiently, aiming for better than O(N²) performance.
Understanding the O(N²) Approach
Before diving into optimizations, let's briefly review the standard O(N²) approach. This method typically involves sorting the array first. Once sorted, we iterate through all possible pairs (a, b) and then use a two-pointer approach or a hash set to quickly check for the existence of c = √(a² + b²). The sorting step takes O(N log N), and the nested loops with a search take O(N²), making the overall complexity O(N²).
def find_pythagorean_triplets_n_squared(arr):
n = len(arr)
if n < 3: return []
# Square all elements and sort them
squared_arr = sorted([x*x for x in arr])
triplets = []
# Iterate through each possible 'c' (largest element)
for i in range(n - 1, 1, -1):
c_squared = squared_arr[i]
left = 0
right = i - 1
while left < right:
a_squared = squared_arr[left]
b_squared = squared_arr[right]
if a_squared + b_squared == c_squared:
# Found a triplet, convert back to original values if needed
# For simplicity, we'll store squared values here
triplets.append((a_squared, b_squared, c_squared))
left += 1
right -= 1
elif a_squared + b_squared < c_squared:
left += 1
else:
right -= 1
return triplets
# Example usage:
# arr = [3, 1, 4, 6, 5]
# print(find_pythagorean_triplets_n_squared(arr))
Python implementation of the O(N²) approach using sorting and two pointers.
Optimizing with Hashing and Pre-computation
To achieve better than O(N²) complexity, we need to avoid the nested loops for finding 'a' and 'b' or 'c'. One effective strategy involves pre-computing squares and storing them in a hash set for O(1) lookups. The core idea is to iterate through all possible pairs (a, b) and check if a² + b² exists in our pre-computed set of squares. However, this still results in O(N²) in the worst case for iterating pairs. The real optimization comes from how we handle the numbers themselves.
A Faster Approach: Leveraging Number Theory and Hash Sets
A truly faster approach often involves a combination of sorting, hashing, and a clever iteration strategy. Consider the equation a² + b² = c². If we iterate through each number x
in the array, we can treat it as either a
, b
, or c
. A more efficient method involves squaring all numbers first and storing them in a hash set. Then, for every pair (x, y)
from the original array, calculate x² + y²
and check if this sum exists in the hash set of squared numbers. This is still O(N²).
To go beyond, we can iterate through the array, considering each element c
as the hypotenuse. For each c
, we need to find a
and b
such that a² + b² = c²
. If we have a hash set of all x²
values, we can iterate through a
and check if c² - a²
exists in the hash set. This is still O(N²).
The most common 'faster' approach that is often cited as better than O(N²) is typically a misinterpretation or relies on specific constraints of the input. For a general array of numbers, finding all triplets is inherently bounded by the number of pairs, which is O(N²). However, if we are looking for primitive Pythagorean triplets or if the range of numbers is limited, specialized algorithms can be faster.
For a general array, the O(N²) approach (sorting + two pointers or sorting + hash set) is generally considered optimal for finding all triplets. Any 'faster' claim usually refers to average case performance, specific data distributions, or finding any triplet rather than all of them.
Let's illustrate the O(N²) approach with a hash set for clarity, as it's often confused with being faster than the two-pointer method, though both are O(N²).
flowchart TD A[Start] B{Square all numbers and store in a hash set} C{Sort the original array (optional, but helps for unique triplets)} D{Iterate 'a' from 0 to N-1} E{Iterate 'b' from 'a+1' to N-1} F{Calculate sum_sq = arr[a]^2 + arr[b]^2} G{Check if sum_sq exists in hash set} H{If exists, add (arr[a], arr[b], sqrt(sum_sq)) to results} I[End] A --> B B --> C C --> D D --> E E --> F F --> G G -- Yes --> H G -- No --> E H --> E E --> D D --> I
Flowchart of the O(N²) approach using a hash set for c
lookup.
import math
def find_pythagorean_triplets_hash_set(arr):
n = len(arr)
if n < 3: return []
# Square all elements and store in a hash set for O(1) lookup
squared_set = {x*x for x in arr}
# Sort the original array to handle duplicates and ensure a < b < c
# This also helps in avoiding redundant checks if we iterate carefully
sorted_arr = sorted(list(set(arr))) # Use set to handle duplicates first
n_unique = len(sorted_arr)
triplets = set() # Use a set to store unique triplets
for i in range(n_unique):
for j in range(i + 1, n_unique):
a = sorted_arr[i]
b = sorted_arr[j]
sum_sq = a*a + b*b
# Check if sum_sq exists in our pre-computed set of squares
if sum_sq in squared_set:
c = int(math.sqrt(sum_sq))
# Ensure c is also in the original array (or its unique sorted version)
# and that c > b to avoid permutations and ensure distinctness
if c > b and c in sorted_arr:
triplets.add(tuple(sorted((a, b, c)))) # Store sorted tuple for uniqueness
return [list(t) for t in triplets]
# Example usage:
# arr = [3, 1, 4, 6, 5, 12, 13]
# print(find_pythagorean_triplets_hash_set(arr))
Python implementation of the O(N²) approach using a hash set.
In conclusion, for finding all Pythagorean triplets in a general array, the O(N²) solutions (either sorting with two pointers or sorting with a hash set) are the most efficient practical approaches. Claims of sub-O(N²) often refer to specific problem variations or average-case scenarios that don't hold universally.