Unicode Converter
Convert text to \uXXXX Unicode escapes and back
Frequently Asked Questions
What does the Unicode converter do?
It converts between text and \uXXXX Unicode escapes and shows each character's code point plus its UTF-8 byte sequence. It is useful when handling non-ASCII characters in JavaScript, Python, or Java source, and when debugging encoding issues.
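As a rough illustration of the round trip described above, here is a minimal Python sketch. It is not the tool's own code (the tool runs in the browser); the function names to_escapes and from_escapes are hypothetical, and the sketch assumes the JavaScript-style convention of one \uXXXX escape per UTF-16 code unit.

```python
import re

# Matches one JavaScript-style escape: backslash, "u", four hex digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def to_escapes(text: str) -> str:
    # Hypothetical helper: keep ASCII as-is, escape everything else.
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        # Encode to UTF-16 (big-endian) and emit one escape per 2-byte unit,
        # so characters above U+FFFF become a surrogate pair.
        data = ch.encode("utf-16-be", "surrogatepass")
        for i in range(0, len(data), 2):
            unit = int.from_bytes(data[i:i + 2], "big")
            out.append("\\u{:04x}".format(unit))
    return "".join(out)


def from_escapes(escaped: str) -> str:
    # Hypothetical helper: turn each escape into its code unit, then let the
    # UTF-16 codec merge any surrogate pairs into a single character.
    units = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), escaped)
    raw = units.encode("utf-16-le", "surrogatepass")
    return raw.decode("utf-16-le", "surrogatepass")


print(to_escapes("中文"))               # \u4e2d\u6587
print(from_escapes("\\u4e2d\\u6587"))   # 中文
```

The sketch only handles the four-digit \uXXXX form; other escape syntaxes are out of scope here.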
What are typical use cases?
Escaping CJK strings for embedding in JSON or regex; turning source like \u4e2d\u6587 back into readable text; inspecting emoji code points (including ZWJ sequences); diagnosing mojibake in logs by pinning down the exact bytes.
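To make the mojibake case concrete, the sketch below reproduces one common failure and then undoes it. It assumes the garbling came from UTF-8 bytes being decoded as Latin-1; real logs may involve a different wrong encoding, so treat the choice of "latin-1" as an assumption. The function names are hypothetical.

```python
def garble(text: str) -> str:
    # Simulate the bug: UTF-8 bytes misread as Latin-1 (an assumed cause).
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    # Reverse the misreading: recover the original bytes, decode as UTF-8.
    return garbled.encode("latin-1").decode("utf-8")


def hex_bytes(text: str, encoding: str = "utf-8") -> str:
    # Show the exact bytes, which is usually the quickest way to spot the issue.
    return " ".join("{:02X}".format(b) for b in text.encode(encoding))


broken = garble("中文")
print(hex_bytes("中文"))   # E4 B8 AD E6 96 87
print(repair(broken))      # 中文
```

Seeing six bytes where two characters were expected is the usual hint that UTF-8 data was decoded with a single-byte encoding.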
Does it support surrogate pairs and supplementary planes?
Yes. Characters above U+FFFF (most emoji, historical scripts) do not fit in a single \uXXXX escape, so they are written as a surrogate pair such as \uD83D\uDE00. The tool emits and parses pairs correctly and can optionally use the ES2015+ code point escape \u{1F600} instead.
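The pair \uD83D\uDE00 follows from the standard UTF-16 arithmetic, sketched below in Python. The function names are hypothetical and this is only an illustration of the calculation, not the tool's implementation.

```python
from typing import Tuple


def to_surrogates(code_point: int) -> Tuple[int, int]:
    # Only supplementary-plane code points (above U+FFFF) need a pair.
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


def from_surrogates(high: int, low: int) -> int:
    # Inverse of to_surrogates: rebuild the 20-bit offset and add 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x1F600)
print("\\u{:04X}\\u{:04X}".format(high, low))   # \uD83D\uDE00
print(hex(from_surrogates(0xD83D, 0xDE00)))     # 0x1f600
```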
How do UTF-8, UTF-16, and code points relate?
A code point is the unique number Unicode assigns to a character (U+4E2D = "中"). UTF-8 encodes each code point as 1–4 bytes and is the default encoding on the web. UTF-16 is JavaScript's in-memory string representation: one 2-byte code unit for BMP characters, and a surrogate pair (two code units, 4 bytes) for supplementary-plane characters. The tool shows all three views side by side so you can compare them directly.
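The three views can be reproduced for a single character with a few lines of Python. This is a sketch for comparison only; the describe function and its dictionary keys are hypothetical names, not part of the tool.

```python
def describe(ch: str) -> dict:
    # ch is expected to be exactly one character (one code point).
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    # Group the UTF-16 bytes into 2-byte code units.
    units = [utf16[i:i + 2].hex().upper() for i in range(0, len(utf16), 2)]
    return {
        "code_point": "U+{:04X}".format(ord(ch)),
        "utf8": " ".join("{:02X}".format(b) for b in utf8),
        "utf16": " ".join(units),
    }


print(describe("中"))
# {'code_point': 'U+4E2D', 'utf8': 'E4 B8 AD', 'utf16': '4E2D'}
print(describe("\U0001F600"))
# {'code_point': 'U+1F600', 'utf8': 'F0 9F 98 80', 'utf16': 'D83D DE00'}
```

Note how "中" takes three bytes in UTF-8 but a single code unit in UTF-16, while the emoji takes four bytes in either encoding.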
Is my data uploaded?
No. Conversion relies on the browser's native JavaScript methods and stays entirely on your device.
How is Unicode conversion different from URL or Base64 encoding?
Unicode escapes keep source code ASCII-only, which helps readability and editor compatibility. URL encoding (percent-encoding) is for URL contexts only. Base64 turns arbitrary bytes into ASCII text. The three serve distinct purposes and are not interchangeable.
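Encoding the same character three ways shows how different the outputs are. The sketch below uses Python's standard library; the compare_encodings name is hypothetical, and UTF-8 is assumed as the byte encoding for the URL and Base64 forms.

```python
import base64
import json
from urllib.parse import quote


def compare_encodings(text: str) -> dict:
    return {
        # json.dumps escapes non-ASCII as \uXXXX by default; strip the quotes.
        "unicode_escape": json.dumps(text)[1:-1],
        # Percent-encoding of the UTF-8 bytes, as used in URLs.
        "url": quote(text),
        # Base64 of the UTF-8 bytes (assumed encoding).
        "base64": base64.b64encode(text.encode("utf-8")).decode("ascii"),
    }


print(compare_encodings("中"))
# {'unicode_escape': '\\u4e2d', 'url': '%E4%B8%AD', 'base64': '5Lit'}
```

The escape form works on code units, while the URL and Base64 forms both work on bytes, which is one reason the results cannot be swapped for one another.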