International Components for Unicode

The ICU project is a technical committee of the Unicode Consortium and is sponsored, supported, and used by IBM and many other companies.[3]

ICU provides the following services: Unicode text handling, full character properties, and character set conversions; Unicode regular expressions; full Unicode sets; character, word, and line boundaries; language-sensitive collation and searching; normalization, upper and lowercase conversion, and script transliterations; comprehensive locale data and resource bundle architecture via the Common Locale Data Repository (CLDR); multiple calendars and time zones; and rule-based formatting and parsing of dates, times, numbers, currencies, and messages.
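
Three of the services listed above (normalization, word boundaries, and language-sensitive collation) can be illustrated with a short sketch. ICU4J is not part of the Java standard library, so this example is an assumption-laden stand-in: it uses the JDK's own java.text classes, which cover only a small subset of comparable functionality and are not ICU itself. ICU4J provides analogous classes in the com.ibm.icu.text package (for example Collator, BreakIterator, and Normalizer2). The class name TextServicesSketch and its method names are hypothetical.

```java
import java.text.BreakIterator;
import java.text.Collator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; it uses JDK java.text classes, not ICU4J.
class TextServicesSketch {

    // Normalization: convert a string to Unicode Normalization Form C (composed).
    static String normalizeNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Word boundaries: keep only the segments that begin with a letter or digit,
    // dropping the whitespace and punctuation segments between them.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (Character.isLetterOrDigit(segment.codePointAt(0))) {
                out.add(segment);
            }
        }
        return out;
    }

    // Collation: sort a copy of the list using the locale's comparison rules
    // rather than raw code unit order.
    static List<String> sortForLocale(List<String> items, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(items);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 (combining acute accent) composes to one code point.
        System.out.println(normalizeNfc("e\u0301").length()); // prints 1
        System.out.println(words("Hello, world!", Locale.ENGLISH)); // prints [Hello, world]
        // Locale-aware order groups "a"/"A" before "b"/"B", unlike binary order.
        System.out.println(sortForLocale(Arrays.asList("b", "A", "a", "B"), Locale.ENGLISH));
    }
}
```

With binary string comparison, "B" would sort before "a"; a locale-aware collator instead places both forms of the letter a before both forms of the letter b, which is the behavior the "language-sensitive collation" service refers to.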

[7] ICU 73.2 includes significant changes for GB18030-2022 compliance support, i.e. for Chinese (the updated Chinese GB18030 Unicode Transformation Format standard is slightly incompatible with the previous version); it has "a modified character conversion table, mapping some GB18030 characters to Unicode characters that were encoded after GB18030-2005". It also has a number of other changes, such as improved Japanese and Korean short-text line breaking, and in "English, the name “Türkiye” is now used for the country instead of “Turkey” (the alternate spelling is also available in the data)."[8]

ICU 74 "updates to Unicode 15.1, including new characters, emoji, security mechanisms, and corresponding APIs and implementations".

ICU 70 added, among other things, support for emoji properties of strings, and it can be built and used with C++20 compilers ("ICU operator==() and operator!=() functions now return bool instead of UBool, as an adjustment for incompatible changes in C++20").[11] As of that version, the minimum supported Windows version is Windows 7.

[12] After Taligent became part of IBM in early 1996, Sun Microsystems decided that the new Java language should have better support for internationalization.

Both frameworks have been enhanced over time to support new facilities and new features of Unicode and the Common Locale Data Repository (CLDR).

The C++ API ignores popular C++ idioms (the STL, RTTI, exceptions, etc.), instead mostly mimicking the Java API.