Characters are not bytes
A "character" you see on screen can take one, two, three or four bytes once it is stored or sent over the wire as UTF-8. That difference is why a field limited to 255 bytes can reject text that "looks" short, and a similar gap between characters and encoded size is why one emoji can push an SMS into extra segments:
| Example | UTF-8 bytes | Range |
|---|---|---|
| A b 7 ! (ASCII) | 1 each | Latin letters, digits, basic punctuation |
| é ñ ü £ | 2 each | Accented Latin, Greek, Cyrillic, symbols |
| 中 あ 한 | 3 each | Most CJK, many other scripts |
| 😀 🎉 𝕏 | 4 each | Emoji, rare CJK, math alphanumerics |
The UTF-8 size stat above counts the exact bytes, so it is the figure to trust for byte-limited columns and payload sizes. SMS segments are the exception: they are sized in GSM-7 or UCS-2 units, not UTF-8 bytes.
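The byte sizes in the table can be checked with a few lines of JavaScript. This is a minimal sketch, not necessarily how the counter itself is implemented; `utf8ByteLength` is a hypothetical helper name, while `TextEncoder` is the standard API available in browsers and Node.js.

```javascript
// Count the bytes a string occupies once encoded as UTF-8.
// TextEncoder always encodes to UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Escapes are used so the exact code points are unambiguous.
console.log(utf8ByteLength("A"));          // 1 (ASCII)
console.log(utf8ByteLength("\u00E9"));     // 2 (é)
console.log(utf8ByteLength("\u4E2D"));     // 3 (中)
console.log(utf8ByteLength("\u{1F600}"));  // 4 (😀)
```

The same call works on whole strings, which makes it a quick pre-check before writing to a byte-limited field or sending a size-capped request.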
Gotcha — emoji count as two. JavaScript measures string length in UTF-16 code units, so a single emoji like 😀 reports as 2 characters even though a person sees one. The byte count above is more honest for storage; the character count matches what "😀".length returns in code.
Common character limits
- SMS: 160 characters for plain GSM text; a single non-GSM character (any emoji, some accented letters) switches the whole message to 70-character segments.
- Meta descriptions: search results show roughly 150–160 characters, so anything beyond that is cut off.
- Social posts and bios: each platform sets its own ceiling, and several count emoji and links differently from plain letters.
- Database columns: a VARCHAR(255) limit in MySQL or PostgreSQL counts characters, but the bytes stored depend on the encoding (up to 4 per character in UTF-8), which matters for multi-byte text.
So a "1-character" emoji can quietly blow an SMS budget: if a 160-character message suddenly splits into several, an emoji or a non-GSM accented character has probably forced the switch to the wider Unicode (UCS-2) encoding.
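The SMS arithmetic can be sketched as a rough estimator. Everything named here (`estimateSmsSegments`, `GSM_BASIC`, `GSM_EXTENDED`) is hypothetical, and the sketch makes some assumptions: the character sets follow the GSM 03.38 default alphabet and its extension table (where each extension character costs two units), and multi-part messages are assumed to carry 153 GSM characters or 67 Unicode units per segment, because part of each segment goes to a concatenation header. Carriers and gateways may differ, so treat the result as an estimate.

```javascript
// GSM 03.38 basic alphabet: each character costs one 7-bit unit (septet).
const GSM_BASIC =
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
  "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà";
// Extension table: reached through an escape code, so each costs two septets.
const GSM_EXTENDED = "^{}\\[~]|€\f";

function estimateSmsSegments(text) {
  let septets = 0;
  let isGsm = true;
  for (const ch of text) {            // iterates by code point
    if (GSM_BASIC.includes(ch)) septets += 1;
    else if (GSM_EXTENDED.includes(ch)) septets += 2;
    else { isGsm = false; break; }    // one non-GSM character is enough
  }
  if (isGsm) {
    const segments = septets <= 160 ? 1 : Math.ceil(septets / 153);
    return { encoding: "GSM-7", units: septets, segments };
  }
  // UCS-2 path: the limit applies to UTF-16 code units, so an emoji costs 2.
  const units = text.length;
  const segments = units <= 70 ? 1 : Math.ceil(units / 67);
  return { encoding: "UCS-2", units, segments };
}

console.log(estimateSmsSegments("a".repeat(160)).segments);               // 1
console.log(estimateSmsSegments("a".repeat(159) + "\u{1F600}").segments); // 3
```

The second call shows the trap: swapping the last of 160 letters for one emoji turns a single message into three segments under these assumptions.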
One emoji, several "characters"
Ask JavaScript how long "👍" is and it says 2. Ask it about a family emoji like 👨👩👧👦 and you might get 11. Neither answer is a bug — String.length counts UTF-16 code units, not the symbols a person sees. Emoji outside the basic range take two code units (a "surrogate pair"), and the more elaborate ones are several separate emoji glued together with invisible zero-width joiners, plus skin-tone modifiers on top. So the thing your eye reads as one character can be a dozen code units under the hood.
When you actually need the human count, String.length is the wrong tool. [...str].length or Array.from(str).length counts code points, which handles surrogate pairs. Intl.Segmenter goes further and counts grapheme clusters, which is the closest thing to "what a person would point at". Most of the time the difference doesn't matter. The moment it does (a strict character limit, truncating a string for display, validating input), slicing at an index derived from plain .length can cut an emoji in half and leave a broken glyph behind.
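The three ways of counting can be compared side by side. This is a minimal sketch; `countCodePoints`, `countGraphemes` and `truncateGraphemes` are hypothetical helper names, and it assumes a runtime that ships `Intl.Segmenter` (current browsers and recent Node.js versions). The emoji are written as escapes so the zero-width joiners are visible in the source.

```javascript
const thumbsUp = "\u{1F44D}";
// Man + ZWJ + woman + ZWJ + girl + ZWJ + boy
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}";

// Code points: surrogate pairs count once, but joined emoji still count separately.
function countCodePoints(text) {
  return Array.from(text).length;
}

// Grapheme clusters: closest to what a person perceives as one character.
function countGraphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let count = 0;
  for (const _ of segmenter.segment(text)) count += 1;
  return count;
}

// Keep at most `max` grapheme clusters, never splitting one.
function truncateGraphemes(text, max, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let out = "";
  let kept = 0;
  for (const { segment } of segmenter.segment(text)) {
    if (kept === max) break;
    out += segment;
    kept += 1;
  }
  return out;
}

console.log(thumbsUp.length, countCodePoints(thumbsUp), countGraphemes(thumbsUp)); // 2 1 1
console.log(family.length, countCodePoints(family), countGraphemes(family));       // 11 7 1
```

By contrast, `("ab" + thumbsUp).slice(0, 3)` ends in a lone surrogate, whereas `truncateGraphemes("ab" + thumbsUp, 3)` keeps the emoji whole.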
Platforms count differently on purpose
There's no single notion of "length" once you leave your editor. X/Twitter weights its limit: a URL always costs 23 characters no matter its real length, and characters in scripts like Chinese or Japanese count as two. Databases differ: in MySQL, a VARCHAR(255) on a utf8mb4 column holds 255 characters, which can take up to 1,020 bytes, while anything limited in bytes (index key lengths, row-size caps, byte-based column types in other engines) fills up far faster with emoji than with ASCII. SMS is the classic trap: a plain message gets 160 characters per segment, but a single emoji or curly quote flips the whole thing to a 70-character Unicode encoding, which is how a "short" text quietly becomes three billed messages.
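Where a limit really is measured in bytes, text has to be trimmed by byte budget without splitting a character. The following is a minimal sketch under that assumption; `truncateToUtf8Bytes` is a hypothetical helper, not part of any database driver.

```javascript
// Keep as many whole code points as fit into `maxBytes` of UTF-8.
function truncateToUtf8Bytes(text, maxBytes) {
  const encoder = new TextEncoder();
  let out = "";
  let used = 0;
  for (const ch of text) {                    // iterates by code point
    const size = encoder.encode(ch).length;   // 1 to 4 bytes
    if (used + size > maxBytes) break;
    out += ch;
    used += size;
  }
  return out;
}

console.log(truncateToUtf8Bytes("ab\u{1F600}", 5)); // "ab" (the emoji needs 4 more bytes)
console.log(truncateToUtf8Bytes("ab\u{1F600}", 6)); // "ab😀"
```

This works per code point, so it never produces invalid UTF-8, but it can still cut through a multi-part emoji such as a joined family sequence. Splitting on grapheme clusters first would avoid that at the cost of a little more code.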
Does the byte count matter, or just the character count?
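Both, in different places. Character counts govern limits that are defined in characters, such as meta descriptions and most social posts. Request and payload size limits and byte-limited database storage are measured in bytes, so multi-byte characters (accents, CJK, emoji) fill them faster than the character count suggests; the UTF-8 byte figure is the one to trust for those. SMS segments are also a fixed size in bytes, but in GSM-7 or UCS-2 rather than UTF-8, so judge them by the 160 and 70 character limits instead.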
How do I count what a person actually sees, not code units?
In JavaScript, [...str].length counts code points (handling emoji surrogate pairs), and Intl.Segmenter counts grapheme clusters, which is closest to perceived characters. Plain str.length counts UTF-16 units and will over-count emoji.
How long should a meta description be?
Aim for about 150–160 characters. Search engines truncate longer descriptions in the results snippet, so put the important words first.
Which number is the SMS limit?
Characters, but with a catch: a plain GSM-7 message fits 160 characters per segment, while any emoji, or any accented letter outside the GSM alphabet, switches the whole message to UCS-2, dropping the limit to 70 per segment. The UTF-8 byte size is not a reliable guide here; when in doubt, look for non-GSM characters.
Does it count spaces and line breaks?
Yes — the main character count includes every space, tab and line break. "No spaces" strips all whitespace, and "Lines" counts line breaks plus one.
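The three figures described above can be reproduced roughly as follows. This is a sketch of the stated rules, not the counter's actual code; `countText` and its field names are hypothetical, and it assumes the main count uses `String.length`.

```javascript
function countText(text) {
  return {
    // Every UTF-16 code unit, including spaces, tabs and line breaks.
    characters: text.length,
    // Same count after stripping all whitespace.
    noSpaces: text.replace(/\s/g, "").length,
    // Line breaks plus one; \r\n is treated as a single break.
    lines: text.split(/\r\n|\r|\n/).length,
  };
}

console.log(countText("a b\nc\td"));
// { characters: 7, noSpaces: 4, lines: 2 }
```

Under this rule an empty input still reports one line; a real tool may choose to show zero instead.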
Is anything sent to a server?
No. All counting runs in your browser; your text is never transmitted or stored.