Platform-Independent Strings

The preferred representation of textual data as strings differs from platform to platform. Most Windows APIs, for example, use wide-character strings (two-byte code units) encoded as UTF-16. Many other platforms use single-byte character strings encoded as UTF-8 (or an ANSI code page). When sending textual data across a network, UTF-8 is the norm.

This makes it a challenge to write platform-independent code that is also efficient. Our solution is to define a typedef, utility::string_t, which corresponds to the platform's preferred string type. Most of the APIs are defined in terms of this typedef rather than std::string or std::wstring, as they were previously. There is also a matching utility::char_t typedef for the character type.

To deal with string literals in a platform-specific way, Casablanca defines a macro U(str), which takes a string literal and produces the platform-specific type:


auto uri = http::uri(U("http://127.0.0.1:10000"));
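
How a typedef and literal macro of this kind fit together can be sketched as follows. This is only an illustration under assumed definitions, not Casablanca's actual header: the namespace sketch, the macro SKETCH_U, the helper make_endpoint, and the use of _WIN32 to select the wide variant are all hypothetical.

```cpp
#include <string>

// Hypothetical stand-ins for utility::char_t, utility::string_t and U(str).
// Assumption: the wide variant is selected whenever _WIN32 is defined.
#ifdef _WIN32
#define SKETCH_U(str) L##str   // turns "..." into the wide literal L"..."
namespace sketch
{
    typedef wchar_t char_t;
    typedef std::wstring string_t;
}
#else
#define SKETCH_U(str) str      // leaves the narrow literal unchanged
namespace sketch
{
    typedef char char_t;
    typedef std::string string_t;
}
#endif

namespace sketch
{
    // Written once against string_t; compiles unchanged for either variant.
    inline string_t make_endpoint(const string_t &host, const string_t &port)
    {
        return SKETCH_U("http://") + host + SKETCH_U(":") + port;
    }
}
```

Because every literal goes through the macro, code such as make_endpoint never mixes narrow and wide strings, whichever variant the platform selects.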


Since existing code may be defined in terms of types other than the platform-preferred string, Casablanca also defines conversions to and from string_t. These may turn out to be no-ops when the source or target is already the type the platform prefers:

namespace utility
{
    namespace conversions
    {
        // Convert a UTF-8 or UTF-16 string to the platform-preferred type.
        static utility::string_t to_string_t(const std::string &s);
        static utility::string_t to_string_t(const utf16string &s);

        // Convert the platform-preferred type to a specific encoding.
        static utf16string to_utf16string(const utility::string_t &s);
        static std::string to_utf8string(const utility::string_t &s);
    }
}
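
What such a conversion has to do when source and target encodings differ can be sketched with the standard library alone. The code below is not Casablanca's implementation: the function names utf8_to_utf16 and utf16_to_utf8 are hypothetical, std::u16string stands in for utf16string (an assumption), and the decoder is deliberately minimal.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

namespace sketch
{
    typedef std::u16string utf16string; // assumption: stand-in for utf16string

    // Decode UTF-8 and re-encode as UTF-16. Minimal: rejects malformed lead and
    // continuation bytes, but does not reject overlong encodings.
    inline utf16string utf8_to_utf16(const std::string &in)
    {
        utf16string out;
        std::size_t i = 0;
        while (i < in.size())
        {
            const unsigned char b0 = static_cast<unsigned char>(in[i]);
            std::uint32_t cp;
            std::size_t len;
            if (b0 < 0x80)              { cp = b0;        len = 1; }
            else if ((b0 >> 5) == 0x06) { cp = b0 & 0x1F; len = 2; }
            else if ((b0 >> 4) == 0x0E) { cp = b0 & 0x0F; len = 3; }
            else if ((b0 >> 3) == 0x1E) { cp = b0 & 0x07; len = 4; }
            else throw std::invalid_argument("invalid UTF-8 lead byte");
            if (i + len > in.size()) throw std::invalid_argument("truncated UTF-8 sequence");
            for (std::size_t k = 1; k < len; ++k)
            {
                const unsigned char b = static_cast<unsigned char>(in[i + k]);
                if ((b & 0xC0) != 0x80) throw std::invalid_argument("invalid continuation byte");
                cp = (cp << 6) | (b & 0x3F);
            }
            i += len;
            if (cp < 0x10000)
            {
                out.push_back(static_cast<char16_t>(cp));
            }
            else
            {
                // Code points above U+FFFF become a surrogate pair.
                cp -= 0x10000;
                out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
                out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
            }
        }
        return out;
    }

    // Decode UTF-16 (combining surrogate pairs) and re-encode as UTF-8.
    inline std::string utf16_to_utf8(const utf16string &in)
    {
        std::string out;
        for (std::size_t i = 0; i < in.size(); ++i)
        {
            std::uint32_t cp = in[i];
            if (cp >= 0xD800 && cp <= 0xDBFF)
            {
                if (i + 1 >= in.size()) throw std::invalid_argument("unpaired high surrogate");
                const std::uint32_t lo = in[i + 1];
                if (lo < 0xDC00 || lo > 0xDFFF) throw std::invalid_argument("unpaired high surrogate");
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
            else if (cp >= 0xDC00 && cp <= 0xDFFF)
            {
                throw std::invalid_argument("unpaired low surrogate");
            }
            if (cp < 0x80)
            {
                out.push_back(static_cast<char>(cp));
            }
            else if (cp < 0x800)
            {
                out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
                out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
            }
            else if (cp < 0x10000)
            {
                out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
                out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
                out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
            }
            else
            {
                out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
                out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
                out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
                out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
            }
        }
        return out;
    }
}
```

On a platform whose preferred string is already UTF-8, a to_string_t taking std::string would simply return its argument, which is the no-op case mentioned above; only the cross-encoding direction needs work like this.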

The 'utf16string' typedef is needed to define a string of two-byte code units unambiguously, because wchar_t, the character type of std::wstring, is not two bytes on every platform (it is typically four bytes on Linux and macOS).
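
The difference is easy to observe with standard C++11 types. The sketch below uses std::u16string as a stand-in for utf16string (an assumption; this page does not say how the typedef is defined), and the helper names utf16_units and wide_units are hypothetical.

```cpp
#include <cstddef>
#include <string>

namespace sketch
{
    // Both helpers store the same single character, U+1F600, which lies
    // outside the Basic Multilingual Plane.

    // UTF-16 needs a surrogate pair for it, so this is two code units
    // regardless of platform.
    inline std::size_t utf16_units()
    {
        return std::u16string(u"\U0001F600").size();
    }

    // With a two-byte wchar_t the literal is also a surrogate pair; with a
    // four-byte wchar_t it is a single unit, so the result varies by platform.
    inline std::size_t wide_units()
    {
        return std::wstring(L"\U0001F600").size();
    }
}
```

Code that assumes std::wstring holds UTF-16 therefore sees different lengths and indices depending on where it is compiled, which a dedicated two-byte string type avoids.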

Last edited May 30, 2013 at 12:36 AM by sanamithani, version 11