Regex remove all special characters except numbers?
Regex to Remove All Special Characters Except Numbers in JavaScript
Learn how to effectively use regular expressions in JavaScript to strip out all characters from a string, leaving only numerical digits.
When working with user input or data from various sources, you often encounter strings that contain a mix of letters, numbers, and special characters. For many applications, such as phone number validation, ID extraction, or numerical calculations, you might need to isolate only the numerical digits. Regular expressions (regex) provide a powerful and concise way to achieve this in JavaScript.
Understanding the Core Regex Pattern
The key to removing all characters except numbers lies in defining what a 'number' is and then inverting that selection. In regex, \d is a special sequence that matches any digit (0-9). To match anything that is not a digit, we use \D. There are two ways to keep only the digits: match the digits themselves and join the matches, or match everything that is not a digit and replace it with an empty string. The second approach is shorter and is the one used in the examples below.
const inputString = "abc123def456!@#789";
// Method 1: Replace non-digits with an empty string
const numbersOnly1 = inputString.replace(/\D/g, '');
console.log(numbersOnly1); // Output: "123456789"
// Method 2: Using a character set to match anything NOT a number
const numbersOnly2 = inputString.replace(/[^0-9]/g, '');
console.log(numbersOnly2); // Output: "123456789"
Basic regex patterns to extract numbers from a string.
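The other approach mentioned above, matching the digits themselves rather than removing the non-digits, can be sketched roughly as follows. The helper name keepDigitsByMatching is hypothetical and not part of any library. It relies on String.prototype.match returning null when nothing matches a global pattern.

```javascript
// Hypothetical helper: collect every run of digits and join them together.
function keepDigitsByMatching(input) {
  // With the g flag, match() returns an array of all matches,
  // or null when the string contains no digits at all.
  const runs = input.match(/\d+/g);
  return runs === null ? '' : runs.join('');
}

console.log(keepDigitsByMatching("abc123def456!@#789")); // "123456789"
console.log(keepDigitsByMatching("no digits here"));     // ""
```

The replace() version is usually shorter, but the match() version is handy when you also want the individual digit groups before joining them.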
The g flag in the regex /\D/g is crucial. It stands for 'global' and ensures that all occurrences of non-digit characters are replaced, not just the first one.
Detailed Explanation of the Regex Components
Let's break down the regex patterns used:
- \D: This is a shorthand character class that matches any character that is not a digit (equivalent to [^0-9]).
- [^0-9]: This is a negated character set. The ^ inside the square brackets negates the set, meaning it matches any character that is not in the range 0 through 9.
- /g: The global flag. Without this flag, replace() would only replace the first non-digit character it finds. With g, it replaces all of them.
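To see the effect of the global flag described above, here is a small sketch comparing the same pattern with and without g. The variable names are illustrative only. The last part assumes a runtime that supports String.prototype.replaceAll (ES2021), which rejects a regex that lacks the g flag.

```javascript
const sample = "a1b2c3";

// Without g: only the first non-digit ("a") is removed.
const firstOnly = sample.replace(/\D/, '');
console.log(firstOnly); // "1b2c3"

// With g: every non-digit is removed.
const allRemoved = sample.replace(/\D/g, '');
console.log(allRemoved); // "123"

// replaceAll() throws a TypeError when given a regex without the g flag.
let threw = false;
try {
  sample.replaceAll(/\D/, '');
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true
```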
Both \D and [^0-9] achieve the same result for this specific problem. The choice between them often comes down to personal preference or specific context where one might be clearer than the other.
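As a quick check of that equivalence, the sketch below runs both patterns over a few sample strings, including digits from other scripts. In JavaScript, \d matches only the ASCII digits 0-9, so non-ASCII digits are stripped by both patterns. The sample strings and variable names are made up for illustration.

```javascript
const samples = [
  "abc123",
  "\u0663 apples and 4 pears", // starts with ARABIC-INDIC DIGIT THREE
  "\uFF11\uFF12\uFF13",        // full-width digits 1, 2, 3
  ""
];

// Apply both patterns to each sample and compare the results.
const results = samples.map(s => ({
  shorthand: s.replace(/\D/g, ''),
  negatedSet: s.replace(/[^0-9]/g, '')
}));

const allEqual = results.every(r => r.shorthand === r.negatedSet);
console.log(results.map(r => r.shorthand)); // [ '123', '4', '', '' ]
console.log(allEqual); // true
```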
flowchart TD
    A[Input String] --> B{"Match Pattern?"}
    B -- "Yes (Non-Digit)" --> C[Replace with ""]
    B -- "No (Digit)" --> D[Keep Character]
    C --> E{"End of String?"}
    D --> E
    E -- "No" --> B
    E -- "Yes" --> F[Output String]
Flowchart illustrating the regex replacement process.
Handling Edge Cases and Common Scenarios
Consider various inputs to ensure your regex works as expected:
- Empty String: An empty string will remain empty.
- String with only numbers: The string will remain unchanged.
- String with only special characters: The string will become empty.
- Numbers with spaces: Spaces are non-digits, so they will be removed.
If you need to preserve spaces or other specific characters alongside numbers, you would adjust the negated character set accordingly. For example, to keep numbers and spaces, you could use /[^0-9 ]/g.
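A minimal sketch of the numbers-and-spaces variant is shown here. The helper name keepDigitsAndSpaces is hypothetical. Collapsing repeated spaces and trimming the result are optional extras added for readability, not something the pattern does on its own.

```javascript
// Hypothetical helper: keep digits and spaces, then tidy up the whitespace.
function keepDigitsAndSpaces(input) {
  return input
    .replace(/[^0-9 ]/g, '') // remove everything except digits and spaces
    .replace(/ {2,}/g, ' ')  // optional: collapse runs of spaces
    .trim();                 // optional: drop leading/trailing spaces
}

// The pattern alone leaves the original spacing in place:
console.log(JSON.stringify("Hello World! 123".replace(/[^0-9 ]/g, ''))); // "  123"

// With the optional cleanup steps:
console.log(keepDigitsAndSpaces("Phone: +1 (555) 123-4567")); // "1 555 1234567"
```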
const testCases = [
"12345",
"Hello World! 123",
" -50.00 ",
"!@#$%^&*()",
"",
"Phone: +1 (555) 123-4567"
];
testCases.forEach(str => {
const result = str.replace(/\D/g, '');
console.log(`Original: "${str}" -> Cleaned: "${result}"`);
});
/* Output:
Original: "12345" -> Cleaned: "12345"
Original: "Hello World! 123" -> Cleaned: "123"
Original: " -50.00 " -> Cleaned: "5000"
Original: "!@#$%^&*()" -> Cleaned: ""
Original: "" -> Cleaned: ""
Original: "Phone: +1 (555) 123-4567" -> Cleaned: "15551234567"
*/
Testing the regex with various input strings.
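Note that /\D/g also removes decimal points (.) and negative signs (-), which is why " -50.00 " becomes "5000" above. If you need to preserve these for numerical values, you'll need a different regex, such as /[^0-9.-]/g. Keep in mind that this pattern keeps every . and - wherever it appears, so the result is not guaranteed to be a valid number.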
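The sketch below shows /[^0-9.-]/g in use, along with one possible stricter alternative for when you want a single signed decimal value. Both helper names (keepNumericChars, extractFirstNumber) are hypothetical, and the stricter pattern is an assumption about what "a number" means here: an optional minus sign, digits, and an optional fractional part.

```javascript
// Hypothetical helper: keep digits, dots, and hyphens wherever they occur.
const keepNumericChars = (s) => s.replace(/[^0-9.-]/g, '');

console.log(keepNumericChars(" -50.00 "));            // "-50.00"
console.log(Number(keepNumericChars(" -50.00 ")));    // -50
console.log(keepNumericChars("v1.2.3-beta"));         // "1.2.3-"
console.log(Number(keepNumericChars("v1.2.3-beta"))); // NaN

// Hypothetical stricter helper: extract the first signed decimal number,
// or return null when the string contains none.
function extractFirstNumber(s) {
  const m = s.match(/-?\d+(?:\.\d+)?/);
  return m === null ? null : Number(m[0]);
}

console.log(extractFirstNumber("Total: -50.00 USD")); // -50
console.log(extractFirstNumber("v1.2.3-beta"));       // 1.2
console.log(extractFirstNumber("no numbers"));        // null
```

Stripping characters is fine for cleanup, but when the result must parse as a number, matching the expected numeric shape tends to be safer than removing everything else.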