JavaScript function to convert UTF8 string between fullwidth and halfwidth forms
Converting UTF-8 Strings: Fullwidth to Halfwidth and Vice Versa in JavaScript

Learn how to create a robust JavaScript function to convert UTF-8 encoded strings between their fullwidth and halfwidth character forms, essential for internationalization and data normalization.
In many East Asian languages, characters can exist in both fullwidth (全角, zenkaku) and halfwidth (半角, hankaku) forms. While visually similar, these forms have different Unicode code points and can cause issues in data processing, search, and display if not handled consistently. This article provides a comprehensive JavaScript solution to convert strings between these two forms, focusing on common character ranges.
Understanding Fullwidth and Halfwidth Characters
Fullwidth characters typically occupy the same horizontal space as two halfwidth characters (like standard ASCII letters). They are often used in East Asian typography to align with CJK (Chinese, Japanese, Korean) characters, which are inherently fullwidth. Halfwidth characters, on the other hand, are commonly used for Latin letters, numbers, and symbols, occupying less horizontal space.
The conversion process involves mapping specific Unicode ranges. For instance, the fullwidth ASCII range (U+FF01 to U+FF5E) corresponds to the halfwidth ASCII range (U+0021 to U+007E), and the two differ by a constant offset of 0xFEE0. Fullwidth Katakana characters also have halfwidth counterparts, but the two Katakana blocks are ordered differently, so they must be matched through a lookup table rather than an offset. The key is to identify these ranges and apply the right mapping for each.
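As a minimal sketch of the offset idea for the ASCII range only, the snippet below derives the offset from the two range starts and applies it to a single character. The names `WIDTH_OFFSET` and `toHalfwidthAscii` are hypothetical and used only for illustration.

```javascript
// Distance between fullwidth "!" (U+FF01) and halfwidth "!" (U+0021).
const WIDTH_OFFSET = 0xFF01 - 0x0021;

// Hypothetical helper: maps one fullwidth ASCII variant to its halfwidth form
// and returns any other character unchanged.
function toHalfwidthAscii(ch) {
  const code = ch.charCodeAt(0);
  return code >= 0xFF01 && code <= 0xFF5E
    ? String.fromCharCode(code - WIDTH_OFFSET)
    : ch;
}

console.log(WIDTH_OFFSET.toString(16));  // "fee0"
console.log(toHalfwidthAscii('\uFF21')); // "A" (U+FF21 is fullwidth A)
console.log(toHalfwidthAscii('x'));      // "x" (already halfwidth)
```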
flowchart TD
    A[Input String] --> B{Iterate Characters}
    B --> C{Is Fullwidth ASCII?}
    C -->|Yes| D[Convert to Halfwidth ASCII]
    C -->|No| E{Is Halfwidth Katakana?}
    E -->|Yes| F[Convert to Fullwidth Katakana]
    E -->|No| G{Is Fullwidth Katakana?}
    G -->|Yes| H[Convert to Halfwidth Katakana]
    G -->|No| I[Keep Original Character]
    D --> J[Append to Result]
    F --> J
    H --> J
    I --> J
    J --> B
    B --> K[Output Converted String]
Flowchart of the character conversion logic
Implementing the Conversion Function
Our JavaScript function will take a string and a toFullwidth boolean flag. If toFullwidth is true, it converts halfwidth characters to fullwidth; otherwise, it converts fullwidth to halfwidth. The core logic involves iterating through each character, checking its Unicode code point, and mapping it to its counterpart if it falls within a convertible range.
We'll handle several key ranges:
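- ASCII Letters, Numbers, and Punctuation: Fullwidth ! (U+FF01) to ~ (U+FF5E) and halfwidth ! (U+0021) to ~ (U+007E).
- Space Character: Fullwidth space (U+3000) and halfwidth space (U+0020).
- Katakana: Fullwidth Katakana (U+30A1 to U+30F6) and halfwidth Katakana (U+FF66 to U+FF9F). The halfwidth block also begins with five halfwidth CJK punctuation marks (U+FF61 to U+FF65), and its letters are not in the same order as the fullwidth block.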
Special attention is needed for the space character, as its fullwidth form (U+3000) does not directly map with the same offset as other ASCII characters.
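A quick check makes the point concrete. This is a small sketch, with `ASCII_OFFSET` and `spaceToFullwidth` as hypothetical names: adding the ASCII offset to the halfwidth space does not land on U+3000, so the space gets its own branch.

```javascript
const ASCII_OFFSET = 0xFEE0;

// Applying the ASCII offset to the halfwidth space (U+0020).
const shifted = 0x0020 + ASCII_OFFSET;
console.log(shifted.toString(16)); // "ff00"
console.log(shifted === 0x3000);   // false

// Hypothetical helper: the space is mapped explicitly instead of by offset.
function spaceToFullwidth(ch) {
  return ch === '\u0020' ? '\u3000' : ch;
}

console.log(spaceToFullwidth(' ') === '\u3000'); // true
```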
// Fullwidth equivalents of the halfwidth block U+FF61..U+FF9F, in code point order.
// The two Katakana blocks are ordered differently, so a lookup table is used instead of an offset.
const FULLWIDTH_KANA = '。「」、・ヲァィゥェォャュョッーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゛゜';

function convertWidth(str, toFullwidth = false) {
  let result = '';
  for (let i = 0; i < str.length; i++) {
    const charCode = str.charCodeAt(i);
    let convertedChar = str[i];
    if (toFullwidth) {
      // Convert halfwidth to fullwidth
      if (charCode >= 0x0021 && charCode <= 0x007E) { // Halfwidth ASCII ! to ~
        convertedChar = String.fromCharCode(charCode + 0xFEE0);
      } else if (charCode === 0x0020) { // Halfwidth space
        convertedChar = String.fromCharCode(0x3000); // Fullwidth space
      } else if (charCode >= 0xFF61 && charCode <= 0xFF9F) { // Halfwidth Katakana and punctuation
        // Voiced marks stay separate characters here, e.g. halfwidth カ + ゙ becomes カ゛, not ガ.
        convertedChar = FULLWIDTH_KANA[charCode - 0xFF61];
      }
    } else {
      // Convert fullwidth to halfwidth
      if (charCode >= 0xFF01 && charCode <= 0xFF5E) { // Fullwidth ASCII ! to ~
        convertedChar = String.fromCharCode(charCode - 0xFEE0);
      } else if (charCode === 0x3000) { // Fullwidth space
        convertedChar = String.fromCharCode(0x0020); // Halfwidth space
      } else {
        // Katakana without a single-character halfwidth form (e.g. ガ) is left unchanged.
        const index = FULLWIDTH_KANA.indexOf(str[i]);
        if (index !== -1) {
          convertedChar = String.fromCharCode(0xFF61 + index);
        }
      }
    }
    result += convertedChar;
  }
  return result;
}
// Example Usage:
const halfwidthString = "Hello, World! 123 ABCアイウ";
const fullwidthString = "Hello, World! 123 ABCアイウ";
console.log("Original Halfwidth:", halfwidthString);
console.log("Converted to Fullwidth:", convertWidth(halfwidthString, true));
console.log("\nOriginal Fullwidth:", fullwidthString);
console.log("Converted to Halfwidth:", convertWidth(fullwidthString, false));
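Voiced and semi-voiced Katakana are a case that a one-to-one character mapping cannot cover: fullwidth ガ is one code point, while its halfwidth form is the two-character sequence カ plus ゙. One possible way to handle this, sketched below, is to go through the combining marks U+3099 and U+309A and let `String.prototype.normalize` compose or decompose the letters. The names `KANA_TABLE`, `halfKanaToFull`, and `fullKanaToHalf` are hypothetical, and the sketch covers Katakana only.

```javascript
// Fullwidth equivalents of U+FF61..U+FF9F in code point order. The last two
// entries are the COMBINING voiced marks so that NFC can compose them.
const KANA_TABLE =
  '。「」、・ヲァィゥェォャュョッーアイウエオカキクケコサシスセソタチツテト' +
  'ナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン\u3099\u309A';

function halfKanaToFull(str) {
  let out = '';
  for (const ch of str) {
    const code = ch.codePointAt(0);
    out += code >= 0xFF61 && code <= 0xFF9F ? KANA_TABLE[code - 0xFF61] : ch;
  }
  // NFC merges base letter + combining mark (note: it also composes any
  // other decomposed sequences present in the input).
  return out.normalize('NFC');
}

function fullKanaToHalf(str) {
  let out = '';
  for (const ch of str) {
    const code = ch.codePointAt(0);
    // Decompose only Katakana letters so that other text is left untouched.
    const parts = code >= 0x30A1 && code <= 0x30FA ? ch.normalize('NFD') : ch;
    for (const part of parts) {
      const index = KANA_TABLE.indexOf(part);
      out += index === -1 ? part : String.fromCharCode(0xFF61 + index);
    }
  }
  return out;
}

// U+FF76 U+FF9E is halfwidth "ka" + voiced mark; U+FF8A U+FF9F is "ha" + semi-voiced mark.
console.log(halfKanaToFull('\uFF76\uFF9E\uFF8A\uFF9F')); // "ガパ"
console.log(fullKanaToHalf('\u30AC\u30D1') === '\uFF76\uFF9E\uFF8A\uFF9F'); // true
```

Katakana with no halfwidth form at all (for example ヵ and ヶ) pass through `fullKanaToHalf` unchanged.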
Considerations and Edge Cases
While the provided function covers many common use cases, it's important to be aware of potential edge cases and limitations:
- Unicode Normalization: This conversion is related to, but narrower than, Unicode normalization. The canonical forms (NFC, NFD) deal with combining characters and canonical equivalents and leave width variants alone. The compatibility forms (NFKC, NFKD) do fold fullwidth ASCII to ASCII and halfwidth Katakana to fullwidth Katakana, but they work in that one direction only and also rewrite many unrelated compatibility characters.
- Character Completeness: Not all characters have a direct fullwidth or halfwidth equivalent. Voiced Katakana such as ガ, for example, have no single halfwidth code point; their halfwidth form is a base letter followed by a separate voiced sound mark. Characters outside the defined ranges will remain unchanged by this function.
- Performance: The function visits each character once, so its cost grows linearly with the length of the string. For typical web application string lengths, this approach is generally efficient enough; for extremely long strings, measure before optimizing.
- Contextual Conversion: In some advanced scenarios, the conversion might depend on the surrounding text or specific linguistic rules. This function performs a direct, character-by-character mapping, which is suitable for most data normalization tasks.
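To illustrate the normalization point above, here is a short sketch using the built-in `String.prototype.normalize` with the `'NFKC'` form. It folds width variants in one direction only and touches other compatibility characters too, so treat it as an alternative with different trade-offs rather than a drop-in replacement. The variable names are illustrative.

```javascript
// Fullwidth "ABC", ideographic space, fullwidth "123", a plain space,
// then halfwidth "ka" followed by the halfwidth voiced sound mark.
const mixed = '\uFF21\uFF22\uFF23\u3000\uFF11\uFF12\uFF13 \uFF76\uFF9E';
const folded = mixed.normalize('NFKC');
console.log(folded); // "ABC 123 ガ"

// Side effects: unrelated compatibility characters are rewritten as well.
const circledOne = '\u2460'.normalize('NFKC'); // circled digit one
const ligature = '\uFB01'.normalize('NFKC');   // "fi" ligature
console.log(circledOne); // "1"
console.log(ligature);   // "fi"
```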
1. Integrate the Function
Copy the convertWidth function into your JavaScript project. You can place it in a utility file or directly within the script where you need it.
2. Call with toFullwidth = true
To convert a halfwidth string to its fullwidth equivalent, call the function with the second argument set to true: convertWidth('abc', true) will return 'abc'.
3. Call with toFullwidth = false
To convert a fullwidth string to its halfwidth equivalent, call the function with the second argument set to false (or omit it, as it defaults to false): convertWidth('ABC') will return 'ABC'.
4. Test Thoroughly
Always test your implementation with various strings, including those with mixed character types, special symbols, and characters that do not have fullwidth/halfwidth equivalents, to ensure it behaves as expected for your specific use case.
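A table of input/output pairs checked in both directions is one simple way to do this. The sketch below is self-contained, so it uses a hypothetical ASCII-only stand-in named `convertAsciiWidth`; in a real project you would run the same cases against your own convertWidth and add Katakana pairs.

```javascript
// Hypothetical ASCII-and-space-only stand-in so this sketch runs on its own.
function convertAsciiWidth(str, toFullwidth = false) {
  let out = '';
  for (const ch of str) {
    const code = ch.codePointAt(0);
    if (toFullwidth && code >= 0x21 && code <= 0x7E) {
      out += String.fromCharCode(code + 0xFEE0);
    } else if (toFullwidth && code === 0x20) {
      out += '\u3000';
    } else if (!toFullwidth && code >= 0xFF01 && code <= 0xFF5E) {
      out += String.fromCharCode(code - 0xFEE0);
    } else if (!toFullwidth && code === 0x3000) {
      out += ' ';
    } else {
      out += ch; // no width variant handled here: keep as-is
    }
  }
  return out;
}

// [halfwidth form, fullwidth form] pairs, checked in both directions.
const cases = [
  ['abc', '\uFF41\uFF42\uFF43'],
  ['A 1!', '\uFF21\u3000\uFF11\uFF01'],
  ['', ''],
  ['\u6F22\u5B57', '\u6F22\u5B57'], // kanji: no width variants
  ['\u{1F600}', '\u{1F600}'],       // emoji stored as a surrogate pair
];

let failures = 0;
for (const [half, full] of cases) {
  if (convertAsciiWidth(half, true) !== full) failures++;
  if (convertAsciiWidth(full, false) !== half) failures++;
}
console.log(failures === 0 ? 'all cases passed' : `${failures} case(s) failed`);
```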