How to Count Characters in Java
Java strings are UTF-16 sequences, like JavaScript strings. String.length() returns the number of UTF-16 code units, so an emoji stored as a surrogate pair counts as 2. Use codePointCount() to count Unicode code points and BreakIterator to count user-perceived characters (grapheme clusters). To see a working browser-based version of the same logic, try our Character Counter; for the equivalent JavaScript pattern, see the JavaScript tutorial.
Method 1: String.length() (code units)
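Returns the number of UTF-16 code units. This equals the character count for ASCII text, but any character outside the Basic Multilingual Plane (BMP), including most emoji, is stored as a surrogate pair and counts as 2.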
String text = "Hello, world!";
System.out.println(text.length()); // 13
System.out.println("😀".length()); // 2 (surrogate pair)
System.out.println("中".length()); // 1 (BMP character)
Method 2: codePointCount() (Unicode characters)
Counts actual Unicode code points, treating surrogate pairs as one.
String text = "Hi 😀!";
System.out.println(text.length()); // 6
System.out.println(text.codePointCount(0, text.length())); // 5
Method 3: BreakIterator for grapheme clusters
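The closest the standard library gets to counting user-perceived characters (grapheme clusters). Results depend on the JDK version: java.text.BreakIterator follows the Unicode extended grapheme cluster rules only from JDK 20 onward, so older runtimes split emoji ZWJ sequences into several pieces.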
import java.text.BreakIterator;
public static int graphemeCount(String text) {
    BreakIterator iterator = BreakIterator.getCharacterInstance();
    iterator.setText(text);
    int count = 0;
    // Each call to next() advances to the following grapheme boundary
    while (iterator.next() != BreakIterator.DONE) {
        count++;
    }
    return count;
}
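// Family emoji: man, woman, girl, boy joined by zero-width joiners (U+200D)
String family = "\uD83D\uDC68\u200D\uD83D\uDC69\u200D\uD83D\uDC67\u200D\uD83D\uDC66";
System.out.println(graphemeCount(family)); // 1 on JDK 20+; older JDKs report more
Method 4: Byte counting (UTF-8)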
For storage and API size limits, count the bytes after encoding.
import java.nio.charset.StandardCharsets;
String text = "Hello, 世界 😀";
int utf8Bytes = text.getBytes(StandardCharsets.UTF_8).length;
System.out.println(utf8Bytes); // 18 (8 ASCII bytes + 6 for 世界 + 4 for the emoji)
int utf16Bytes = text.getBytes(StandardCharsets.UTF_16LE).length;
System.out.println(utf16Bytes); // 24 (12 code units × 2 bytes)
Method 5: Specific character counts
Common patterns for counting categories or specific characters.
String text = "Hello, world!";
// Characters without spaces
long noSpaces = text.chars().filter(c -> !Character.isWhitespace(c)).count();
// Letter count
long letters = text.chars().filter(Character::isLetter).count();
// Count occurrences of 'l'
long ls = text.chars().filter(c -> c == 'l').count();
Common Pitfalls
⚠ String.length() on emoji is misleading
Most emoji are stored as surrogate pairs in UTF-16, so "A".length() == 1 but "😀".length() == 2. Prefer codePointCount() or BreakIterator for user-facing character limits.
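As a minimal sketch of a user-facing limit based on code points, the class below checks a limit and truncates without cutting a surrogate pair in half. The class and method names (CodePointLimit, fitsLimit, truncate) are hypothetical, and the sketch assumes maxCodePoints is non-negative.

```java
final class CodePointLimit {

    /** Returns true if text contains at most maxCodePoints Unicode code points. */
    static boolean fitsLimit(String text, int maxCodePoints) {
        return text.codePointCount(0, text.length()) <= maxCodePoints;
    }

    /**
     * Truncates text to at most maxCodePoints code points.
     * Assumes maxCodePoints >= 0 (an assumption of this sketch).
     */
    static String truncate(String text, int maxCodePoints) {
        if (fitsLimit(text, maxCodePoints)) {
            return text;
        }
        // offsetByCodePoints converts a code point count into a char index,
        // so the cut never lands between the two halves of a surrogate pair.
        int endIndex = text.offsetByCodePoints(0, maxCodePoints);
        return text.substring(0, endIndex);
    }

    public static void main(String[] args) {
        String text = "Hi \uD83D\uDE00!"; // "Hi 😀!" written with escapes
        System.out.println(fitsLimit(text, 5)); // true
        System.out.println(truncate(text, 4));  // Hi 😀
    }
}
```

Truncating by code point can still split a grapheme cluster such as an emoji ZWJ sequence; limits that must respect visible characters would need the BreakIterator approach instead.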
⚠ Char arrays don't match Unicode
A char[] in Java holds UTF-16 code units, not Unicode characters. Iterating over text.toCharArray() visits the two halves of a surrogate pair separately.
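One way to walk a string without splitting pairs is to step by code point using codePointAt() and Character.charCount(). The sketch below is illustrative only; the class and method names are hypothetical.

```java
final class CodePointWalk {

    /** Counts code points by stepping over each surrogate pair as a single unit. */
    static int countCodePoints(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            // charCount is 2 for supplementary code points, 1 otherwise
            i += Character.charCount(codePoint);
            count++;
        }
        return count;
    }

    /** Counts the chars in toCharArray() that are surrogate halves. */
    static int countSurrogateUnits(String text) {
        int count = 0;
        for (char c : text.toCharArray()) {
            if (Character.isSurrogate(c)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Hi \uD83D\uDE00!"; // "Hi 😀!" written with escapes
        System.out.println(text.toCharArray().length);  // 6
        System.out.println(countCodePoints(text));      // 5
        System.out.println(countSurrogateUnits(text));  // 2
    }
}
```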
⚠ Stream chars() returns code units as ints
text.chars() returns an IntStream of UTF-16 code units, not code points. Use text.codePoints() when you need code points.
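The difference shows up as soon as a letter lies outside the BMP. The sketch below counts letters both ways; the class and method names are hypothetical, and the sample string uses U+20000, a CJK ideograph stored as a surrogate pair.

```java
final class LetterCount {

    /** Counts letters per UTF-16 code unit; surrogate halves are not letters. */
    static long lettersByCodeUnit(String text) {
        return text.chars().filter(Character::isLetter).count();
    }

    /** Counts letters per Unicode code point. */
    static long lettersByCodePoint(String text) {
        return text.codePoints().filter(Character::isLetter).count();
    }

    /** Counts occurrences of one code point, including supplementary ones. */
    static long occurrences(String text, int codePoint) {
        return text.codePoints().filter(cp -> cp == codePoint).count();
    }

    public static void main(String[] args) {
        String text = "a\uD840\uDC00"; // 'a' followed by U+20000
        System.out.println(lettersByCodeUnit(text));           // 1
        System.out.println(lettersByCodePoint(text));          // 2
        System.out.println(occurrences("Hello, world!", 'l')); // 3
    }
}
```

For text that stays within the BMP, both streams give the same counts, so the chars() version only goes wrong on supplementary characters.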
See a Working Character Counter
Our Character Counter is built using the patterns from this tutorial. Open the dev tools to inspect the live implementation.