CppCon 2019 has ended
Friday, September 20 • 13:30 - 14:30
Catch ⬆️: Unicode for C++23

It's 2019 and Unicode is still barely supported in both the C and C++ standards.

Between the POSIX standard requiring a single-byte encoding by default, the heavy limitations placed on codecvt facets in C++, and the utter lack of UTF-8/16/32 multi-unit conversion functions in the standard, the programming languages that have shaped development across operating systems, embedded devices, and mobile applications have pushed forward a world that is incredibly unfriendly to text beyond ASCII. Developers frequently roll their own solutions, and almost every major codebase -- from Chrome to Firefox to Qt to CopperSpice and more -- has its own variation of hand-crafted text processing. With no standard implementation in C++ and libraries split between various third-party implementations and ICU, it is increasingly difficult and error-prone to use C++ to handle the basic means of communication between people on the planet, to say nothing of the security holes found in hand-rolled libraries that do not carefully handle this tricky design space.

Study Group 16, the Unicode arm of Standard C++, has already won some small victories: character types that represent UTF-8 exclusively, mandated UTF-16 and UTF-32 encoding for literals, and an updated Unicode Standard reference in C++. With the last of the foundational work for the C standards committee underway and participation from individuals at Mozilla, Google, Qt, Microsoft, Bloomberg, and Apple informing the design, this effort is planned to be the biggest and best addition of first-class Unicode support to C++.

This talk is an overview of the problem space -- Text in C++ -- the people who are tackling the problem -- Study Group 16 -- and the first major libraries and works to be produced for handling encoding and normalization both flexibly and efficiently. It will cover what we learned from their predecessors -- Boost.Text, text_view, ICU and Ogonek -- as well as what can reasonably be expected for C++23 and what aspirations SG16 has for the future. Come see the new face of range-friendly encoding, decoding, and normalization interfaces in C++.

JeanHeyd Meneide

Student, Columbia University
JeanHeyd "ThePhD" is a student at Columbia University in New York. Most of his programming is for fun and as a hobby, even if his largest open-source contribution -- sol2 -- is used across many industries. He is currently working towards earning his own nickname, climbing the academic...

Friday September 20, 2019 13:30 - 14:30 MDT
Crest 3
  • Parsing/Text and I/O