How do I remove repeated elements from ArrayList?

Efficiently Remove Duplicate Elements from an ArrayList in Java

Learn various techniques to eliminate repeated elements from Java ArrayLists, improving data integrity and performance. This guide covers methods using HashSet, LinkedHashSet, Java 8 Streams, and traditional loops.

An ArrayList is a resizable-array implementation of the List interface, and it permits duplicate elements. In many scenarios, however, you need a list that contains only unique values. Removing duplicates is a common step in data processing and data cleaning. This article explores several methods to achieve this, from simple Set-based approaches to the Java 8 Stream API and a manual loop.

Understanding the Problem: Duplicates in ArrayList

Before diving into solutions, it helps to see how duplicates arise and why removing them matters. Consider an ArrayList storing user IDs, product names, or any other data where each entry should be unique. Duplicates can lead to incorrect calculations, redundant processing, or skewed data analysis. The right method depends on whether element order must be preserved, how large the list is, and which Java version you target.

flowchart TD
    A[Original ArrayList] --> B{Contains Duplicates?}
    B -- Yes --> C[Identify Duplicates]
    C --> D[Remove Duplicates]
    D --> E[Unique ArrayList]
    B -- No --> E

Flowchart illustrating the process of handling duplicates in an ArrayList.
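
The "Contains Duplicates?" decision in the flowchart can be checked before doing any removal. The following is a minimal sketch, assuming the elements implement equals and hashCode consistently; the class name DuplicateCheck and the helper hasDuplicates are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class DuplicateCheck {
    // Hypothetical helper: true if at least one element occurs more than once.
    public static <T> boolean hasDuplicates(List<T> list) {
        // A set drops repeats, so a smaller set means the list had duplicates.
        return new HashSet<>(list).size() < list.size();
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Apple");
        System.out.println(hasDuplicates(fruits)); // true
    }
}
```

If the check returns false, the list is already unique and the removal step can be skipped.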

Method 1: Using HashSet for Unordered Unique Elements

The HashSet class stores only unique elements and makes no guarantee about iteration order. This makes it a good choice for quickly removing duplicates when the order of elements is not important. The process involves copying all elements of the ArrayList into a HashSet, which discards repeats automatically, and then copying the contents of the HashSet back into the ArrayList. Because the elements come back in the set's iteration order, the resulting list may be ordered differently from the original.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

public class RemoveDuplicates {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");

        System.out.println("Original ArrayList: " + listWithDuplicates);

        // 1. Create a HashSet from the ArrayList
        Set<String> uniqueElements = new HashSet<>(listWithDuplicates);

        // 2. Clear the original ArrayList
        listWithDuplicates.clear();

        // 3. Add all unique elements back to the ArrayList
        listWithDuplicates.addAll(uniqueElements);

        System.out.println("ArrayList after removing duplicates (HashSet): " + listWithDuplicates);
    }
}
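
The example above modifies the original list. When the caller should keep its input unchanged, the same idea can be wrapped in a helper that returns a new list. This is a sketch under that assumption; HashSetDedupe and unique are hypothetical names, and the order of the returned list is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class HashSetDedupe {
    // Hypothetical helper: returns a new list and leaves the input untouched.
    public static <T> List<T> unique(List<T> input) {
        return new ArrayList<>(new HashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 3, 2, 1);
        List<Integer> result = unique(numbers);
        System.out.println(result.size()); // 3
        System.out.println(numbers);       // [3, 1, 3, 2, 1]
    }
}
```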

Method 2: Using LinkedHashSet for Ordered Unique Elements

If you need to remove duplicates while preserving the original insertion order of the elements, LinkedHashSet is the natural choice. LinkedHashSet extends HashSet but maintains a doubly-linked list running through its entries, so iteration order is the order in which elements were inserted. Re-inserting an element that is already present does not change its position, which means each value stays where it first occurred. The approach is otherwise the same as with HashSet.

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

public class RemoveDuplicatesOrdered {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");

        System.out.println("Original ArrayList: " + listWithDuplicates);

        // 1. Create a LinkedHashSet from the ArrayList
        Set<String> uniqueElements = new LinkedHashSet<>(listWithDuplicates);

        // 2. Clear the original ArrayList
        listWithDuplicates.clear();

        // 3. Add all unique elements back to the ArrayList
        listWithDuplicates.addAll(uniqueElements);

        System.out.println("ArrayList after removing duplicates (LinkedHashSet): " + listWithDuplicates);
    }
}
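
The clear-and-refill steps above can be packaged as a reusable generic method. The following is one possible sketch; OrderedDedupe and dedupeInPlace are hypothetical names, and the list passed in is assumed to be modifiable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class OrderedDedupe {
    // Hypothetical helper: removes repeats from the list in place,
    // keeping the first occurrence of each element.
    public static <T> void dedupeInPlace(List<T> list) {
        LinkedHashSet<T> seen = new LinkedHashSet<>(list);
        list.clear();
        list.addAll(seen);
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(3, 1, 3, 2, 1));
        dedupeInPlace(numbers);
        System.out.println(numbers); // [3, 1, 2]
    }
}
```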

Method 3: Using Java 8 Streams (Concise and Modern)

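For Java 8 and later, the Stream API provides a concise, functional way to remove duplicates. The distinct() method returns a stream consisting of the distinct elements of the original stream, compared using Object.equals(Object). For an ordered stream, such as the one returned by list.stream(), the first occurrence of each element is kept and the original order is preserved. Unlike the Set-based examples, this approach produces a new list and leaves the original unchanged.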

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class RemoveDuplicatesStream {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");

        System.out.println("Original ArrayList: " + listWithDuplicates);

        // Use Stream API to get distinct elements and collect them into a new ArrayList
        List<String> listWithoutDuplicates = listWithDuplicates.stream()
                                                .distinct()
                                                .collect(Collectors.toCollection(ArrayList::new));

        System.out.println("ArrayList after removing duplicates (Java 8 Stream): " + listWithoutDuplicates);
    }
}
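
Because distinct() compares elements with equals, it only collapses custom objects when their class overrides equals and hashCode. The sketch below illustrates this with a hypothetical User class that is not part of the original examples; DistinctByEquals and distinctUsers are likewise illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctByEquals {
    // Hypothetical value class used only for this illustration.
    public static final class User {
        final int id;
        final String name;

        public User(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof User)) return false;
            User that = (User) other;
            return id == that.id && name.equals(that.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }
    }

    // Two users with the same id and name count as one element.
    public static List<User> distinctUsers(List<User> users) {
        return users.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User(1, "Ann"), new User(2, "Bob"), new User(1, "Ann"));
        System.out.println(distinctUsers(users).size()); // 2
    }
}
```

Without the equals and hashCode overrides, the two "Ann" objects would be treated as different elements and all three entries would survive.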

Method 4: Manual Iteration (Less Efficient, but Fundamental)

Manual iteration is less efficient than the Set-based approaches for large lists, but it shows the underlying idea clearly. The method iterates through the original list and adds each element to a new list only if it has not been added before. Each contains() call scans the new list from the start, so the total work grows roughly quadratically with the number of elements. This approach also preserves order.

import java.util.ArrayList;

public class RemoveDuplicatesManual {
    public static void main(String[] args) {
        ArrayList<String> listWithDuplicates = new ArrayList<>();
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Banana");
        listWithDuplicates.add("Apple");
        listWithDuplicates.add("Orange");
        listWithDuplicates.add("Banana");

        System.out.println("Original ArrayList: " + listWithDuplicates);

        ArrayList<String> listWithoutDuplicates = new ArrayList<>();

        for (String element : listWithDuplicates) {
            if (!listWithoutDuplicates.contains(element)) {
                listWithoutDuplicates.add(element);
            }
        }

        System.out.println("ArrayList after removing duplicates (Manual Iteration): " + listWithoutDuplicates);
    }
}
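
The repeated contains() scan is what makes the loop above slow on large inputs. One common variation, sketched here with the hypothetical names ManualDedupeWithSet and dedupe, keeps the explicit loop but tracks already-seen elements in a HashSet, relying on the fact that Set.add returns false when the element is already present.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ManualDedupeWithSet {
    // Hypothetical helper: same loop as above, with a set for the membership test.
    public static <T> List<T> dedupe(List<T> input) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T element : input) {
            // Set.add returns false when the element is already present.
            if (seen.add(element)) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Apple", "Banana", "Apple", "Orange", "Banana");
        System.out.println(dedupe(fruits)); // [Apple, Banana, Orange]
    }
}
```

This keeps the order-preserving behavior of the manual loop while replacing the linear scan with a hash lookup.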