char

202407191538
Status: #idea
Tags: Java

char

Attention

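Unicode code points range from U+0000 to U+10_FFFF. The encodings themselves have room for more: a 4-byte UTF-8 sequence carries 21 payload bits (up to U+1F_FFFF), and a 32-bit code unit could hold values up to U+FFFF_FFFF.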
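
Since this note is tagged Java, here is a minimal sketch of how the code point range shows up in `java.lang.Character`. The class name `CodePointRange` and its helper methods are made up for illustration; only `Character.MAX_CODE_POINT`, `Character.isValidCodePoint` and `Character.charCount` are real API.

```java
public class CodePointRange {
    // True when cp is inside the Unicode code point range U+0000..U+10FFFF.
    static boolean isValid(int cp) {
        return Character.isValidCodePoint(cp);
    }

    // Number of 16-bit Java chars needed to hold cp:
    // 1 for U+0000..U+FFFF, 2 (a surrogate pair) for anything above.
    static int charsNeeded(int cp) {
        return Character.charCount(cp);
    }

    public static void main(String[] args) {
        System.out.printf("max code point = U+%X%n", Character.MAX_CODE_POINT);
        System.out.println("U+10FFFF valid? " + isValid(0x10FFFF));
        System.out.println("U+110000 valid? " + isValid(0x110000));
        System.out.println("chars for U+0041:  " + charsNeeded(0x0041));
        System.out.println("chars for U+1F600: " + charsNeeded(0x1F600));
    }
}
```

A single `char` is 16 bits wide, so any code point above U+FFFF needs two of them.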

Unicode encodings

UTF-8

Pasted image 20240719143810.png

Warning

1 byte has 1 fixed bit → 7 effective bits
2 bytes have 5 fixed bits → 11 effective bits
3 bytes have 8 fixed bits → 16 effective bits
4 bytes have 11 fixed bits → 21 effective bits
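
The counts above can be checked with a short Java sketch. The class `Utf8Length` and both of its methods are hypothetical helpers written for this note; the fixed-bit formula assumes the standard UTF-8 layout (a lead byte with `n` one-bits and a zero, then `n - 1` continuation bytes starting with `10`).

```java
import java.nio.charset.StandardCharsets;

public class Utf8Length {
    // Encodes one code point as UTF-8 and returns how many bytes it took.
    // Not meaningful for lone surrogates (U+D800..U+DFFF), which cannot be encoded.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Effective (payload) bits of an n-byte UTF-8 sequence, for n in 1..4.
    static int effectiveBits(int n) {
        // 1 byte: only the leading 0 is fixed.
        // n bytes: lead byte fixes n + 1 bits, each continuation byte fixes 2.
        int fixedBits = (n == 1) ? 1 : (n + 1) + 2 * (n - 1);
        return 8 * n - fixedBits;
    }

    public static void main(String[] args) {
        int[] samples = {0x0041, 0x00E9, 0x20AC, 0x1F600};
        for (int cp : samples) {
            int n = utf8Length(cp);
            System.out.printf("U+%04X -> %d byte(s), %d effective bits%n",
                    cp, n, effectiveBits(n));
        }
    }
}
```

The four sample code points were picked so that each one falls into a different length class, from 1 byte up to 4 bytes.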


References

  1. https://www.oracle.com/technical-resources/articles/javase/supplementary.html
  2. https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/lang/Character.html#unicode
  3. https://www.wikiwand.com/en/UTF-8#Encoding