C++ - Convert between signed char & unsigned char representing UTF8
I am using both libxml2 and ICU in the same project. They represent UTF8 differently: libxml2 uses unsigned char*, and the ICU constructors take plain char* (which on Pentium 64-bit is equivalent to signed char).
Question: how do I convert between the two? Can I use static_cast?
I understand that UTF8 only cares that the underlying data type is at least 8 bits long. Both signed char and unsigned char satisfy this. I am just wondering if there is a gotcha here. Are there any corner cases?
Edit: at the compiler's (g++/Gentoo) insistence, only reinterpret_cast can do the conversion (without relying on a C-style cast). Let's say we have two unsigned char strings: 0000 and 1000. Would the conversion turn them both into 0? Is that possible under UTF8?
Some libraries use char for storing UTF-8, others use unsigned char. In this case you may need to cast between char* and unsigned char* using reinterpret_cast, since these types have the same storage unit size and alignment. For example:
char const* s = ...;
unsigned char const* p = reinterpret_cast<unsigned char const*>(s);
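As a slightly fuller illustration of the cast, here is a minimal sketch that wraps it in small helpers and checks that the bytes are the same when viewed through either pointer type. The helper names (as_unsigned, as_char, to_std_string, round_trip_preserves_bytes) are made up for this example and are not part of libxml2 or ICU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: view a char-based UTF-8 buffer as unsigned char.
// Only the pointer type changes; the bytes in memory are not touched.
inline unsigned char const* as_unsigned(char const* s) {
    return reinterpret_cast<unsigned char const*>(s);
}

// Hypothetical helper: the opposite direction.
inline char const* as_char(unsigned char const* p) {
    return reinterpret_cast<char const*>(p);
}

// Hypothetical helper: copy a NUL-terminated unsigned char UTF-8 string
// into a std::string.
inline std::string to_std_string(unsigned char const* p) {
    assert(p != nullptr);
    return std::string(as_char(p));
}

// Returns true if casting to unsigned char const* and back yields the same
// pointer and the same bytes (compared up to the terminating NUL).
inline bool round_trip_preserves_bytes(char const* s) {
    unsigned char const* p = as_unsigned(s);
    char const* back = as_char(p);
    return back == s && std::memcmp(p, s, std::strlen(s)) == 0;
}
```

With these helpers, a string such as "\xC3\xA9" (the UTF-8 encoding of é) reads back as the bytes 0xC3 and 0xA9 through the unsigned char pointer, and round_trip_preserves_bytes returns true for it.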
static_cast can simulate reinterpret_cast through an intermediate conversion to void*, i.e. char* -> void* -> unsigned char*. For example:
char const* s = ...;
void const* intermediate = s;
unsigned char const* p = static_cast<unsigned char const*>(intermediate);
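On the corner cases the question asks about: the pointer casts above do not change the stored bytes, so two different byte values stay different after the conversion. Signedness can matter when individual bytes are read through a plain char and then compared or used in arithmetic, because on platforms where char is signed every UTF-8 byte of 0x80 or above reads as a negative value. The sketch below shows one way to guard against that by converting each value to unsigned char first. The function names are made up for this example, and count_code_points assumes its input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if c is the lead byte of a multi-byte UTF-8
// sequence (0xC2..0xF4). It takes plain char to show the safe conversion.
inline bool is_multibyte_lead(char c) {
    // Converting the value to unsigned char yields 0..255 whether or not
    // plain char is signed on this platform.
    unsigned char const u = static_cast<unsigned char>(c);
    return u >= 0xC2 && u <= 0xF4;
}

// Hypothetical helper: counts code points in a NUL-terminated UTF-8 string
// by counting every byte that is not a continuation byte.
// Assumption: the input is valid UTF-8; nothing is validated here.
inline std::size_t count_code_points(char const* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        unsigned char const u = static_cast<unsigned char>(*s);
        if ((u & 0xC0) != 0x80) {  // continuation bytes look like 10xxxxxx
            ++n;
        }
    }
    return n;
}
```

Comparing the plain char directly, as in c >= 0xC2, would always be false for these bytes where char is signed, which is why the value is converted before the comparison.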