Mastering Bash Wildcards for N-Digit Matching
Learn how to effectively use Bash wildcards and regular expressions to match filenames and strings containing a specific number of digits.
Bash wildcards are powerful tools for pattern matching in the shell, primarily used for filename expansion. While standard wildcards like * (matches zero or more characters) and ? (matches exactly one character) are versatile, they don't directly support matching a precise number of digits. This article explores how to achieve N-digit matching using a combination of Bash's extended globbing features and regular expressions, providing practical examples for common scenarios.
Understanding Bash Wildcards and Globbing
Bash's default wildcard behavior, known as globbing, is distinct from regular expressions. Globbing patterns are expanded by the shell before a command is executed. The basic wildcards are:
*: Matches any sequence of zero or more characters.
?: Matches any single character.
[...]: Matches any one of the enclosed characters. A hyphen can specify a range (e.g., [0-9] for any digit, [a-z] for any lowercase letter).
[!...]: Matches any character not in the enclosed set.
For N-digit matching, the [0-9] character class is the most relevant, as it matches exactly one digit. The ? wildcard also matches a single character, but that character can be anything, not only a digit. To match exactly N digits, repeat [0-9] N times. However, this becomes cumbersome for larger N. This is where extended globbing or regular expressions become useful.
# Matching any single character (digit or not)
ls file_?.txt
# Matching any three characters (digits or not)
ls file_???.txt
# Matching a single digit using a character class
ls file_[0-9].txt
# Matching exactly three digits using character classes
ls file_[0-9][0-9][0-9].txt
Basic wildcard usage for digit matching
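Repeating [0-9] by hand gets tedious as N grows. As a minimal sketch, the pattern can be generated instead; the helper name digit_glob and the sample filenames below are assumptions made for illustration, not part of Bash.

```bash
#!/usr/bin/env bash
# digit_glob N: print a glob fragment made of N copies of [0-9].
# (Hypothetical helper, defined here for illustration only.)
digit_glob() {
  local n=$1 pad
  printf -v pad '%*s' "$n" ''     # build a string of N spaces
  printf '%s' "${pad// /[0-9]}"   # replace each space with [0-9]
}

# Work in a scratch directory with a few sample files
cd "$(mktemp -d)" || exit 1
touch file_12345.txt file_1234.txt file_abcde.txt

pattern="file_$(digit_glob 5).txt"
# $pattern is intentionally unquoted so the shell expands it as a glob
matches=( $pattern )
printf '%s\n' "${matches[@]}"
```

With these sample files, only file_12345.txt is printed: file_1234.txt has too few digits and file_abcde.txt has none.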
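By default, a glob that matches nothing is passed to the command unchanged, so ls reports an error for the literal pattern. To change this, run shopt -s nullglob, which makes unmatched globs expand to nothing.
Enabling Extended Globbing for Advanced Patterns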
Bash offers extended globbing features that provide more powerful pattern matching capabilities, similar to regular expressions but still operating at the globbing level. To enable these, you need to use the shopt -s extglob command. Once enabled, you can use operators like:
?(pattern): Matches zero or one occurrence of pattern.
*(pattern): Matches zero or more occurrences of pattern.
+(pattern): Matches one or more occurrences of pattern.
@(pattern): Matches exactly one occurrence of pattern.
!(pattern): Matches anything except pattern.
While these are powerful, they still don't directly offer a {n} quantifier for exact N-digit matches. For that, we often combine them with character classes or resort to grep with regular expressions.
# Enable extended globbing
shopt -s extglob
# Match file_ followed by one or more digits, then .txt
ls file_+([0-9]).txt
# Match file_ followed by zero or one digit, then .txt
ls file_?([0-9]).txt
# Match file_ followed by exactly one digit, then .txt (same as [0-9])
ls file_@([0-9]).txt
Using extended globbing for digit patterns
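Because +([0-9]) accepts any number of digits, one way to narrow it to exactly N is to check the length of the digit run afterwards. The following is a sketch under assumed filenames; the variable names n, digits, and exact are illustrative.

```bash
#!/usr/bin/env bash
# extglob enables +(...); nullglob makes the loop run zero times if nothing matches
shopt -s extglob nullglob

cd "$(mktemp -d)" || exit 1
touch file_1.txt file_12.txt file_123.txt file_abc.txt

n=3          # required number of digits
exact=()
for f in file_+([0-9]).txt; do
  digits=${f#file_}        # strip the leading "file_"
  digits=${digits%.txt}    # strip the trailing ".txt"
  if (( ${#digits} == n )); then
    exact+=( "$f" )
  fi
done
printf '%s\n' "${exact[@]}"
```

Here the glob selects the three all-digit names, and the length check keeps only file_123.txt. Note that shopt -s extglob has to run before Bash parses any line containing an extended pattern.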
flowchart TD
    A[Start] --> B{Need N-digit match?}
    B -- Yes --> C{Is `extglob` enabled?}
    C -- No --> D[Enable `shopt -s extglob`]
    D --> E{Exact N digits?}
    C -- Yes --> E
    E -- Yes, small N --> F[Use `[0-9]` repeated N times]
    E -- Yes, large N or complex --> G[Use `grep -E` with regex `[0-9]{N}`]
    E -- No, range of digits --> H[Use `+([0-9])` or `*([0-9])`]
    F --> I[End]
    G --> I
    H --> I
    B -- No --> J[Use standard wildcards `*`, `?`, `[...]`]
    J --> I
Decision flow for N-digit matching in Bash
Leveraging Regular Expressions with grep
For precise N-digit matching, especially when N is large or when you need more complex patterns, regular expressions (regex) are the most robust solution. While Bash's globbing doesn't directly support regex quantifiers like {n}, you can pipe the output of ls or find to grep.
The key regex quantifier for N-digit matching is {n}, which matches the preceding element exactly n times. For digits, this would be [0-9]{n}.
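To use extended regular expressions (which include {n}), you typically use grep -E. The older egrep command does the same thing, but it is deprecated and recent GNU grep releases print a warning when it is used.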
# Create some test files
touch file_1.txt file_12.txt file_123.txt file_1234.txt file_abc.txt
# Match files with exactly 3 digits in their name (before .txt)
# ^ and $ anchor the pattern to the whole filename
ls | grep -E '^file_[0-9]{3}\.txt$'
# Match files with exactly 2 digits in their name
ls | grep -E '^file_[0-9]{2}\.txt$'
# Match files with 2 to 4 digits
ls | grep -E '^file_[0-9]{2,4}\.txt$'
# Match files with at least 2 digits
ls | grep -E '^file_[0-9]{2,}\.txt$'
Using grep -E for N-digit matching with regular expressions
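Bash can also apply the same kind of extended regular expression itself, through the =~ operator inside [[ ]], which avoids piping ls output at all. The sketch below assumes the same sample files; the variable names re and three_digit are illustrative.

```bash
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1
touch file_1.txt file_12.txt file_123.txt file_1234.txt file_abc.txt

# Keeping the regex in a variable avoids quoting problems inside [[ ]]
re='^file_[0-9]{3}\.txt$'

three_digit=()
for f in *; do
  # =~ tests the filename against the extended regular expression in $re
  if [[ $f =~ $re ]]; then
    three_digit+=( "$f" )
  fi
done
printf '%s\n' "${three_digit[@]}"
```

Only file_123.txt passes the test. Because the loop iterates over a glob rather than parsing text output, filenames containing spaces or newlines are handled correctly.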
When using grep with ls, be aware that ls output can be problematic if filenames contain special characters or newlines. For more robust scripting, find with -regex is often preferred for matching filenames, and find -print0 | xargs -0 grep for searching file contents.
Practical Examples and Best Practices
Let's look at some common scenarios and the best way to handle them.
Scenario 1: Matching exactly 5 digits in a filename
Suppose you have files like report_12345.csv, report_123.csv, and report_abc.csv, and want only the 5-digit ones.
Scenario 2: Matching a range of digits
Suppose you need the files whose names contain 3 to 6 digits.
Scenario 3: Matching N digits within a string (not just filenames)
When processing text content, grep is the go-to tool.
# Scenario 1: Exactly 5 digits in filename
# Using grep with an anchored regex
ls | grep -E '^report_[0-9]{5}\.csv$'
# With plain globbing (no extglob needed), repeat [0-9] five times.
# Avoid ? here: it matches any character, not just digits.
ls report_[0-9][0-9][0-9][0-9][0-9].csv
# Scenario 2: 3 to 6 digits in filename
ls | grep -E '^report_[0-9]{3,6}\.csv$'
# Scenario 3: N digits within a string
echo "The code is 12345 and the ID is 678." | grep -oE '[0-9]{5}'
# Output: 12345
echo "Order 123, Item 4567, Ref 890123" | grep -oE '[0-9]{3}'
# Output: 123\n456\n890\n123 (longer numbers are cut into 3-digit chunks)
Advanced N-digit matching examples
The -o option with grep prints only the matched (non-empty) parts of a matching line, with each match on a separate output line. This is very useful when extracting specific N-digit sequences.
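Note that -o on its own also returns 3-digit pieces of longer numbers. If only standalone N-digit numbers are wanted, one option is to add -w, which restricts matches to whole words. This is a sketch using an assumed sample string; letters, digits, and underscores count as word characters, so a value such as id_123 would not match.

```bash
#!/usr/bin/env bash
text="Order 123, Item 4567, Ref 890123"

# Without -w: every non-overlapping run of 3 digits, even inside longer numbers
partial=$(printf '%s\n' "$text" | grep -oE '[0-9]{3}')

# With -w: only 3-digit runs that form a whole word
whole=$(printf '%s\n' "$text" | grep -owE '[0-9]{3}')

printf 'partial:\n%s\n' "$partial"
printf 'whole:\n%s\n' "$whole"
```

The first command yields 123, 456, 890, and 123, while the second yields only the standalone 123.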