How do I remove repeated elements from ArrayList?
Efficiently Remove Duplicate Elements from an ArrayList in Java

Learn various techniques to eliminate repeated elements from Java ArrayLists, improving data integrity and performance. This guide covers methods using HashSet, LinkedHashSet, Java 8 Streams, and traditional loops.
ArrayLists are dynamic arrays in Java that allow duplicate elements by default. However, in many scenarios, you might need to ensure that your list contains only unique values. Removing duplicates is a common task in data processing, data cleaning, and preparing data for further operations. This article explores several effective methods to achieve this, ranging from simple set-based approaches to more modern Java 8 Stream API solutions.
Understanding the Problem: Duplicates in ArrayList
Before diving into solutions, it's important to visualize how duplicates can exist and why their removal is necessary. Consider an ArrayList storing user IDs, product names, or any other data where each entry should ideally be unique. Duplicates can lead to incorrect calculations, redundant processing, or skewed data analysis. The choice of method often depends on factors like preserving order, performance requirements, and Java version compatibility.
flowchart TD
    A[Original ArrayList] --> B{Contains Duplicates?}
    B -- Yes --> C[Identify Duplicates]
    C --> D[Remove Duplicates]
    D --> E[Unique ArrayList]
    B -- No --> E
Flowchart illustrating the process of handling duplicates in an ArrayList.
Method 1: Using HashSet for Unordered Unique Elements
The HashSet class in Java stores only unique elements and does not maintain insertion order. This makes it an excellent choice for quickly removing duplicates if the order of elements is not important. The process involves adding all elements from the ArrayList to a HashSet, which automatically handles uniqueness, and then converting the HashSet back into an ArrayList.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

public class RemoveDuplicates {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");
        System.out.println("Original ArrayList: " + listWithDuplicates);

        // 1. Create a HashSet from the ArrayList
        Set<String> uniqueElements = new HashSet<>(listWithDuplicates);

        // 2. Clear the original ArrayList
        listWithDuplicates.clear();

        // 3. Add all unique elements back to the ArrayList
        listWithDuplicates.addAll(uniqueElements);

        System.out.println("ArrayList after removing duplicates (HashSet): " + listWithDuplicates);
    }
}
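A closely related set-based variation, if a sorted result is acceptable instead of the original insertion order, is to use a TreeSet. This is a minimal sketch (the class name RemoveDuplicatesSorted is illustrative, not part of the methods above):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RemoveDuplicatesSorted {
    public static void main(String[] args) {
        List<String> fruits = List.of("Banana", "Apple", "Orange", "Apple");
        // TreeSet removes duplicates and keeps elements in natural (sorted) order
        List<String> sortedUnique = new ArrayList<>(new TreeSet<>(fruits));
        System.out.println(sortedUnique); // [Apple, Banana, Orange]
    }
}
```

Note that TreeSet requires its elements to be Comparable (or a Comparator to be supplied), unlike HashSet, which only needs equals() and hashCode().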
HashSet offers O(1) average time complexity for add operations.
Method 2: Using LinkedHashSet for Ordered Unique Elements
If you need to remove duplicates while preserving the original insertion order of the elements, LinkedHashSet is the ideal choice. LinkedHashSet extends HashSet but maintains a doubly-linked list running through its entries, ensuring that iteration order is the order in which elements were inserted. The approach is similar to using HashSet.
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

public class RemoveDuplicatesOrdered {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");
        System.out.println("Original ArrayList: " + listWithDuplicates);

        // 1. Create a LinkedHashSet from the ArrayList
        Set<String> uniqueElements = new LinkedHashSet<>(listWithDuplicates);

        // 2. Clear the original ArrayList
        listWithDuplicates.clear();

        // 3. Add all unique elements back to the ArrayList
        listWithDuplicates.addAll(uniqueElements);

        System.out.println("ArrayList after removing duplicates (LinkedHashSet): " + listWithDuplicates);
    }
}
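The clear-and-addAll pattern above generalizes to any element type. A minimal generic helper sketch (the class and method names DedupUtil and distinctPreservingOrder are illustrative, not from a standard library):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class DedupUtil {
    // Order-preserving de-duplication for any element type with
    // well-defined equals() and hashCode().
    static <T> List<T> distinctPreservingOrder(List<T> input) {
        // LinkedHashSet drops duplicates while keeping insertion order
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(3, 1, 3, 2, 1);
        System.out.println(distinctPreservingOrder(ids)); // [3, 1, 2]
    }
}
```

Returning a new list, rather than clearing and refilling the original, also avoids mutating the caller's collection.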
Because LinkedHashSet preserves order, it might be slightly slower than HashSet due to the overhead of maintaining the linked list, though it is still very efficient for most use cases.
Method 3: Using Java 8 Streams (Concise and Modern)
For Java 8 and later, the Stream API provides a very concise and functional way to remove duplicates. The distinct() method of a stream returns a stream consisting of the distinct elements (according to Object.equals(Object)) of that stream. This method also preserves the original order of elements.
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class RemoveDuplicatesStream {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");
        System.out.println("Original ArrayList: " + listWithDuplicates);

        // Use the Stream API to get distinct elements and collect them into a new ArrayList
        List<String> listWithoutDuplicates = listWithDuplicates.stream()
                .distinct()
                .collect(Collectors.toCollection(ArrayList::new));

        System.out.println("ArrayList after removing duplicates (Java 8 Stream): " + listWithoutDuplicates);
    }
}
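Because distinct() compares elements with equals(), custom classes need equals() and hashCode() overrides for value-based duplicate detection to work. A minimal sketch with a hypothetical User class (the class is purely illustrative):

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctCustomObjects {

    // Hypothetical value class used only for illustration.
    static final class User {
        final int id;
        final String name;

        User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        // Without these overrides, distinct() would compare object
        // references, and both "Alice" entries below would survive.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof User)) return false;
            User other = (User) o;
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }

        @Override
        public String toString() {
            return "User(" + id + ", " + name + ")";
        }
    }

    public static void main(String[] args) {
        List<User> users = List.of(
                new User(1, "Alice"),
                new User(2, "Bob"),
                new User(1, "Alice")); // equal by value, not by reference
        List<User> unique = users.stream()
                .distinct()
                .collect(Collectors.toList());
        System.out.println(unique); // [User(1, Alice), User(2, Bob)]
    }
}
```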
The distinct() method relies on the equals() and hashCode() methods of the objects in the list. Ensure these methods are correctly implemented for custom objects to guarantee proper duplicate detection.
Method 4: Manual Iteration (Less Efficient, but Fundamental)
While it is less efficient for large lists than Set-based approaches, understanding how to remove duplicates using manual iteration is fundamental. This method typically involves iterating through the original list and adding elements to a new list only if they haven't been added before. This approach also preserves order.
import java.util.ArrayList;

public class RemoveDuplicatesManual {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");
        System.out.println("Original ArrayList: " + listWithDuplicates);

        ArrayList<String> listWithoutDuplicates = new ArrayList<>();
        for (String element : listWithDuplicates) {
            // contains() scans the list linearly, so the loop is O(n^2) overall
            if (!listWithoutDuplicates.contains(element)) {
                listWithoutDuplicates.add(element);
            }
        }

        System.out.println("ArrayList after removing duplicates (Manual Iteration): " + listWithoutDuplicates);
    }
}
This approach has O(n²) worst-case time complexity (due to the linear-time contains() method on ArrayList), making it unsuitable for very large lists. Prefer Set-based or Stream API solutions for better performance.
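If you want to keep the explicit loop but avoid the quadratic cost, a common refinement is to track already-seen elements in a HashSet. A sketch (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicatesSeenSet {
    // Same loop structure as the manual method, but the membership check
    // is an O(1) average-time HashSet lookup instead of an O(n) scan.
    static <T> List<T> removeDuplicates(List<T> input) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T element : input) {
            if (seen.add(element)) { // add() returns false if already present
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> fruits = List.of("Apple", "Banana", "Apple", "Orange", "Banana");
        System.out.println(removeDuplicates(fruits)); // [Apple, Banana, Orange]
    }
}
```

This keeps insertion order like the manual method, but runs in O(n) average time, making it a reasonable compromise when you need loop-level control (for example, to log or count each duplicate as it is skipped).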