Zero-Width Joiner U+200D
- Code point: U+200D
- Decimal: 8205
- CSS escape: \200D
- JavaScript escape: \u200D
- URL-encoded: %E2%80%8D
- UTF-8 bytes: E2 80 8D
- Category: Format (Cf)
- Block: General Punctuation
The zero-width joiner (ZWJ, U+200D) is an invisible Unicode character that forces adjacent characters to join or connect. While it takes up no visible space, it has a massive impact on how text and emoji render on screen.
Most people encounter the ZWJ every day without knowing it. Every time you send a family emoji, a profession emoji, or an emoji with a specific gender and skin tone, your device is using ZWJ characters behind the scenes to combine multiple individual emoji into one composite image.
Beyond emoji, ZWJ is essential in Arabic, Persian, Devanagari, and other complex scripts where it forces letters to connect even when they would normally appear in their isolated forms.
How ZWJ Creates Emoji
The ZWJ acts as invisible glue between emoji. Your device sees the sequence and, if it supports that combination, renders a single combined emoji instead of the individual parts.
Examples of ZWJ Sequences
The key thing to understand is that these "single" emoji are actually multiple characters joined by ZWJ. The Woman Technologist emoji is actually three Unicode code points: Woman (U+1F469) + ZWJ (U+200D) + Laptop (U+1F4BB). The family emoji can be seven or more code points long.
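A minimal Python sketch of the composition described above. The variable names are illustrative, and whether the result displays as one glyph depends on the font and platform doing the rendering.

```python
# Build ZWJ sequences by hand; all names here are illustrative.
ZWJ = "\u200d"  # ZERO WIDTH JOINER

WOMAN = "\U0001F469"
LAPTOP = "\U0001F4BB"
MAN, GIRL, BOY = "\U0001F468", "\U0001F467", "\U0001F466"

woman_technologist = WOMAN + ZWJ + LAPTOP
family = ZWJ.join([MAN, WOMAN, GIRL, BOY])

# Python's len() counts code points, not visible glyphs.
print(len(woman_technologist))  # 3
print(len(family))              # 7
print([f"U+{ord(c):04X}" for c in woman_technologist])
# ['U+1F469', 'U+200D', 'U+1F4BB']
```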
What Happens When ZWJ Sequences Aren't Supported
When you send a ZWJ emoji to a device or platform that does not support that specific sequence, the device falls back to showing the individual emoji separately. Instead of seeing one combined emoji, you see the component parts side by side.
This is why you sometimes see messages with strange emoji combinations. The sender's device combined them into one emoji, but your device showed them as separate characters because it doesn't support that specific ZWJ sequence.
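The fallback can be imitated in Python by splitting a sequence on U+200D. This is only an illustration: real renderers make this decision during font shaping rather than by splitting strings, and `fallback_parts` is a hypothetical helper name.

```python
ZWJ = "\u200d"

def fallback_parts(sequence: str) -> list[str]:
    """Return the pieces a renderer without support for the sequence would show."""
    return [part for part in sequence.split(ZWJ) if part]

# Man + ZWJ + Woman + ZWJ + Girl + ZWJ + Boy
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"

parts = fallback_parts(family)
print(len(parts))  # 4 separate emoji instead of one family
```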
Common Uses
- Creating composite emoji. Family groups, professions, skin tone + gender combinations, and accessibility emoji all use ZWJ sequences.
- Forcing connected letter forms in Arabic and Persian. In Arabic script, letters change shape depending on their position in a word. ZWJ forces the connected form even at word boundaries.
- Devanagari and Indic script ligature control. ZWJ creates specific conjunct forms in Hindi, Sanskrit, and other scripts.
- Complex text layout in South Asian writing systems. Bengali, Tamil, Telugu, and other scripts use ZWJ for precise control over how characters combine.
ZWJ in Arabic and Persian Text
ZWJ was originally created for complex scripts, not emoji. In Arabic and Persian, letters have up to four forms: isolated, initial (start of word), medial (middle of word), and final (end of word). The form used depends on what letters are adjacent.
ZWJ forces a letter to use its connected form even when it would normally appear isolated. This is critical for correct typography in Persian, where the rules for when letters connect differ from Arabic, and writers need fine control over letter forms.
For example, placing ZWJ after the last letter of a word makes that letter behave as if another letter followed it, so it takes its medial or initial form instead of its final or isolated form. This can change the meaning or visual style of the text.
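As a sketch of how these forms are requested in practice, the Python snippet below builds the four variants of a single letter. The choice of the Arabic letter heh (U+0647) is just an example, and the code only constructs the strings: the shapes you actually see depend on the font and shaping engine.

```python
import unicodedata

ZWJ = "\u200d"
HEH = "\u0647"  # ARABIC LETTER HEH, chosen only as an example

variants = {
    "isolated": HEH,              # no joiner on either side
    "initial": HEH + ZWJ,         # joins forward, as if a letter followed
    "final": ZWJ + HEH,           # joins backward, as if a letter preceded
    "medial": ZWJ + HEH + ZWJ,    # joins on both sides
}

for label, text in variants.items():
    names = [unicodedata.name(ch) for ch in text]
    print(f"{label:9} {names}")
```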
Technical Details
ZWJ is classified as a Format character (Cf) in Unicode. Key technical properties:
- Width: Zero. It produces no visible glyph and takes up no horizontal space.
- Line breaking: ZWJ prevents line breaks at its position. The characters on either side of a ZWJ will not be split across lines.
- Joining behavior: ZWJ causes the characters on either side to use their joining (connected) forms, as if they were adjacent letters in the same word.
- Emoji behavior: When placed between two emoji characters, it triggers the rendering engine to look up the combined sequence in its emoji table.
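The classification and encodings of U+200D can be checked with Python's standard library. This is a small sketch using `unicodedata` and `urllib.parse.quote`. It confirms the character's properties but says nothing about rendering behavior.

```python
import unicodedata
from urllib.parse import quote

ZWJ = "\u200d"

print(unicodedata.name(ZWJ))          # ZERO WIDTH JOINER
print(unicodedata.category(ZWJ))      # Cf
print(ord(ZWJ))                       # 8205
print(ZWJ.encode("utf-8").hex(" "))   # e2 80 8d
print(quote(ZWJ))                     # %E2%80%8D
```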
In regular expressions, match ZWJ with \u200D. To count the "visual" length of a string containing ZWJ emoji, you need a grapheme cluster-aware library, since a single visual emoji may contain multiple code points joined by ZWJ.
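A minimal sketch of detecting and removing ZWJ with Python's `re` module. The helper names `count_zwj` and `strip_zwj` are hypothetical. Keep in mind that stripping ZWJ breaks composite emoji into their parts and can alter letter shaping in Arabic and Indic text, so removal is not always safe.

```python
import re

ZWJ_PATTERN = re.compile("\u200d")

def count_zwj(text: str) -> int:
    """Count ZWJ code points in text."""
    return len(ZWJ_PATTERN.findall(text))

def strip_zwj(text: str) -> str:
    """Remove every ZWJ; composite emoji fall apart into their components."""
    return ZWJ_PATTERN.sub("", text)

sample = "dev: \U0001F469\u200d\U0001F4BB"
print(count_zwj(sample))                                  # 1
print(strip_zwj(sample) == "dev: \U0001F469\U0001F4BB")   # True
```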
Security Considerations
- String length confusion. A single visual emoji character may be 7+ code points long when it uses ZWJ. Code that measures string length by code points will get unexpected results.
- Text comparison issues. Searching for a person emoji will not match a gendered or skin-toned version of that emoji, since the ZWJ version is a different character sequence.
- Input validation bypass. Fields with character limits may be bypassed if the limit counts visual characters differently from code points.
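The length and comparison issues above can be made concrete with a short Python sketch. `length_report` is a hypothetical helper that measures the same string three ways, since the "length" a limit enforces depends on which unit the code counts.

```python
def length_report(text: str) -> dict[str, int]:
    """Measure one string in code points, UTF-16 code units, and UTF-8 bytes."""
    return {
        "code_points": len(text),
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }

# One visible family emoji: Man + ZWJ + Woman + ZWJ + Girl + ZWJ + Boy
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
print(length_report(family))
# {'code_points': 7, 'utf16_units': 11, 'utf8_bytes': 25}

# Searching for the gender-neutral person emoji (U+1F9D1) does not find
# the Woman Technologist sequence, which is built on U+1F469.
person = "\U0001F9D1"
woman_technologist = "\U0001F469\u200d\U0001F4BB"
print(person in woman_technologist)  # False
```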
Counting Characters in ZWJ Emoji
One of the trickiest consequences of ZWJ emoji is string length. A single visible emoji can be surprisingly long when measured in code points: the Woman Technologist emoji is three code points, and a four-person family emoji is seven.
This matters when you are building character counters, enforcing length limits, or truncating text. Use the Intl.Segmenter API in JavaScript or a grapheme-cluster library to count visual characters instead of code points.
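Python's standard library has no grapheme segmenter, so the sketch below is a deliberately rough approximation written for illustration. It is not the full Unicode segmentation algorithm: it ignores flag sequences, Hangul syllables, and other rules, and the function names are hypothetical. For production use, prefer a real grapheme-cluster library.

```python
import unicodedata

ZWJ = "\u200d"
VARIATION_SELECTORS = {"\ufe0e", "\ufe0f"}

def is_extender(ch: str) -> bool:
    """True for code points that attach to the preceding cluster (simplified)."""
    return (
        ch == ZWJ
        or ch in VARIATION_SELECTORS
        or "\U0001F3FB" <= ch <= "\U0001F3FF"          # emoji skin tone modifiers
        or unicodedata.category(ch) in ("Mn", "Me")    # combining marks
    )

def approx_visual_length(text: str) -> int:
    """Approximate the number of visible characters in text."""
    count = 0
    previous = None
    for ch in text:
        # A code point joins the current cluster if it is an extender
        # or if it directly follows a ZWJ; otherwise it starts a new one.
        if previous is None or not (is_extender(ch) or previous == ZWJ):
            count += 1
        previous = ch
    return count

family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"
print(len(family), approx_visual_length(family))  # 7 1
```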
How ZWJ Differs from Similar Characters
These four zero-width characters have distinct, often opposite behaviors:
ZWJ and ZWNJ (zero-width non-joiner, U+200C) are direct opposites: ZWJ requests a connection, while ZWNJ prevents one. See our ZWNJ page for details on preventing character connections, particularly in Persian and Arabic text.