How do I split a string in Java?
Mastering String Splitting in Java: A Comprehensive Guide

Learn how to effectively split strings in Java using various methods, including split(), StringTokenizer, and regular expressions, for robust data parsing.
Splitting a string is a fundamental operation in programming, allowing you to break down a larger string into an array of smaller substrings based on a specified delimiter. In Java, there are several powerful ways to achieve this, each with its own advantages and use cases. This article will guide you through the most common and effective methods, helping you choose the right tool for your specific needs.
Using the String.split() Method
The split() method of the String class is the most common and often the simplest way to split a string in Java. It takes a regular expression as its argument, which defines the delimiter. The method returns an array of strings, where each element is a substring of the original string, separated by the delimiter.
String sentence = "Hello,world,Java,programming";
String[] words = sentence.split(",");
for (String word : words) {
    System.out.println(word);
}
// Output:
// Hello
// world
// Java
// programming
Basic usage of String.split() with a comma delimiter.
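Because the argument is a regular expression, a delimiter such as a dot or a pipe behaves differently from a plain comma. The following is a minimal sketch of the two usual workarounds; the class name SplitEscapeDemo and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitEscapeDemo {
    // Splits on a literal dot. The regex is \. and the backslash is
    // doubled because it sits inside a Java string literal.
    static String[] splitOnDot(String input) {
        return input.split("\\.");
    }

    // Splits on any literal delimiter; Pattern.quote() escapes it for us.
    static String[] splitOnLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDot("www.example.com")));
        // prints [www, example, com]

        System.out.println(Arrays.toString(splitOnLiteral("a|b|c", "|")));
        // prints [a, b, c]

        // Unescaped, "." matches every character, so every piece is an
        // empty string and all of them are dropped as trailing empties.
        System.out.println("www.example.com".split(".").length);
        // prints 0
    }
}
```

Pattern.quote() is usually the safer choice when the delimiter comes from user input or configuration, since there is no need to know in advance which characters are special.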
When using split(), remember that the delimiter is a regular expression. If your delimiter contains special regex characters (e.g., ., |, *, +, ?, ^, $, \, {}, [], ()), you'll need to escape them with a backslash (\). In Java source code the backslash itself must be doubled inside a string literal, so to split by a dot, use split("\\."), or use Pattern.quote() for automatic escaping.
Handling Empty Strings and Limit Parameter
The split() method also has an overloaded version that accepts a limit parameter. This parameter controls the number of times the pattern is applied and thus affects the length of the resulting array. If the limit is positive, the pattern is applied at most limit - 1 times, and the array's length is no greater than limit. If the limit is zero, the pattern is applied as many times as possible and trailing empty strings are discarded; this is what the single-argument split() does. If the limit is negative, the pattern is applied as many times as possible and trailing empty strings are kept.
import java.util.Arrays;
String data = "apple,,banana,orange";
// No limit (default behavior, trailing empty strings are discarded)
String[] parts1 = data.split(",");
System.out.println("Parts (no limit): " + Arrays.toString(parts1));
// Output: Parts (no limit): [apple, , banana, orange]
// Limit = 2 (splits at most once)
String[] parts2 = data.split(",", 2);
System.out.println("Parts (limit 2): " + Arrays.toString(parts2));
// Output: Parts (limit 2): [apple, ,banana,orange]
// Limit = -1 (keeps all empty strings, including trailing ones)
String[] parts3 = data.split(",", -1);
System.out.println("Parts (limit -1): " + Arrays.toString(parts3));
// Output: Parts (limit -1): [apple, , banana, orange]
String emptyTrailing = "a,b,c,";
String[] parts4 = emptyTrailing.split(",");
System.out.println("Parts (trailing empty, no limit): " + Arrays.toString(parts4));
// Output: Parts (trailing empty, no limit): [a, b, c]
String[] parts5 = emptyTrailing.split(",", -1);
System.out.println("Parts (trailing empty, limit -1): " + Arrays.toString(parts5));
// Output: Parts (trailing empty, limit -1): [a, b, c, ]
Demonstrating the limit parameter and handling of empty strings.
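A positive limit is handy when only the first delimiter is meaningful, for example when parsing key=value lines whose value may itself contain the delimiter. This is a small sketch under that assumption; KeyValueSplitDemo and parseKeyValue are hypothetical names, not part of any library.

```java
class KeyValueSplitDemo {
    // Splits on the first '=' only, so any later '=' stays in the value.
    static String[] parseKeyValue(String line) {
        return line.split("=", 2);
    }

    public static void main(String[] args) {
        String[] pair = parseKeyValue("url=https://example.com/?q=java");
        System.out.println(pair[0]); // prints url
        System.out.println(pair[1]); // prints https://example.com/?q=java

        // With a positive limit, a trailing empty string is kept,
        // so a key with an empty value still yields two elements.
        System.out.println(parseKeyValue("flag=").length); // prints 2
    }
}
```

Without the limit, the first call would return three pieces and the value would have to be stitched back together.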
flowchart TD
    A[Input String] --> B{Delimiter?}
    B -- Yes --> C[Split Point]
    C --> D[Substring]
    D --> E{More Delimiters?}
    E -- Yes --> C
    E -- No --> F[End of String]
    F --> G[Array of Substrings]
    B -- No --> G
Conceptual flow of the String.split() method.
Splitting with StringTokenizer (Legacy Approach)
The StringTokenizer class is a legacy class from Java's early days, primarily used for breaking strings into tokens. While it still works, it's generally recommended to use String.split() or regular expressions for new code due to StringTokenizer's limitations (e.g., it doesn't support regular expressions and cannot return empty strings as tokens). However, you might encounter it in older codebases.
import java.util.StringTokenizer;
String text = "one two three";
StringTokenizer tokenizer = new StringTokenizer(text, " ");
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken());
}
// Output:
// one
// two
// three
Example of using StringTokenizer to split a string by space.
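The empty-token limitation is easiest to see side by side. The sketch below feeds the same record, which has an empty middle field, to both approaches; EmptyTokenDemo and tokenize are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class EmptyTokenDemo {
    // Collects every token StringTokenizer produces for the given delimiters.
    static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String record = "alice,,42"; // the middle field is empty

        // StringTokenizer skips the empty field entirely.
        System.out.println(tokenize(record, ","));
        // prints [alice, 42]

        // split() keeps it as an empty string in the middle.
        System.out.println(Arrays.asList(record.split(",")));
        // prints [alice, , 42]
    }
}
```

For column-oriented data, the missing field shifts every later value one position to the left, which is why split() tends to be the safer fit there.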
Avoid StringTokenizer for new development. It's considered a legacy class and lacks the flexibility and power of regular expressions provided by String.split() and the java.util.regex package. It also silently skips empty tokens, so consecutive delimiters never produce an empty field.
Advanced Splitting with Pattern and Matcher
For more complex splitting scenarios, especially when you need fine-grained control over the regular expression or want to reuse a compiled pattern, the java.util.regex.Pattern class offers a more powerful approach. You can compile a Pattern and then use its split() method, which behaves similarly to String.split() but allows for pre-compilation of the regex.
import java.util.regex.Pattern;
import java.util.Arrays;
String logEntry = "ERROR: File not found. Code: 404. Timestamp: 2023-10-27";
// Split by any of ": ", ". ", or ", "
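// The backslash is doubled because "\." is not a valid Java string escape; a comma needs no escaping
Pattern pattern = Pattern.compile(": |\\. |, ");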
String[] parts = pattern.split(logEntry);
System.out.println(Arrays.toString(parts));
// Output: [ERROR, File not found, Code, 404, Timestamp, 2023-10-27]
Using Pattern.split() for more complex regular expression delimiters.
Pattern.compile() is more efficient if you need to split many strings using the same regular expression, as the pattern is compiled only once. For single-use splits, String.split() is often sufficient and more concise.
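One way to apply this is to keep the compiled pattern in a constant and reuse it for every input line. The sketch below assumes comma-separated input with optional whitespace around the commas; ReusablePatternDemo, COMMA, and splitAll are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ReusablePatternDemo {
    // Compiled once: a comma with optional whitespace on either side.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Splits every line with the same precompiled pattern.
    static List<String[]> splitAll(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(COMMA.split(line));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a, b ,c", "1 ,2,  3");
        for (String[] row : splitAll(lines)) {
            System.out.println(Arrays.toString(row));
        }
        // prints [a, b, c]
        // prints [1, 2, 3]
    }
}
```

Pattern objects are immutable and safe to share between threads, so a static final field is a common place to hold one.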