On 30/07/2015 at 20:54, xxxxxxxx wrote:
Storing a UTF-8 encoded string in a std::wstring doesn't make much sense in my opinion, as UTF-8 uses 8-bit code units. Storing UTF-8 encoded data in a std::string should be perfectly fine. What you can store in a std::wstring depends on the platform: on Windows it would be UTF-16 (wchar_t is 16 bits there), on Linux or OS X it's UTF-32 (wchar_t is 32 bits), if I recall correctly.
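To check what your own platform gives you, a minimal sketch like the following could help. The function name likely_wide_encoding is made up for illustration, and the result is only a guess based on the width of wchar_t; the C++ standard does not promise any particular encoding.

```cpp
#include <string>

// Hypothetical helper: guesses which UTF encoding a std::wstring most
// plausibly holds, based only on the width of wchar_t on this platform.
// This is a heuristic; the standard does not tie wchar_t to an encoding.
std::string likely_wide_encoding() {
    if (sizeof(wchar_t) == 2) return "UTF-16";  // typical for Windows
    if (sizeof(wchar_t) == 4) return "UTF-32";  // typical for Linux / OS X
    return "unknown";
}
```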
As said before, std::string and std::wstring are both Unicode agnostic (they're just a sequence of char/wchar_t). So if you store UTF-8 or UTF-16 encoded data (both variable-length encodings) in a std::string or a std::wstring, you shouldn't rely on the built-in string operations that work on individual elements or positions (substr, erase, insert, operator[] and so on), as these can cut a multi-unit sequence in half and invalidate the encoding. With UTF-32 that shouldn't be a problem, as it is a fixed-length encoding with one code unit per code point.
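To make that concrete, here is a small sketch of how position-based operations can go wrong with UTF-8 in a std::string. The helper names count_code_points and is_code_point_boundary are my own, and both assume the input is already well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by skipping
// continuation bytes (those of the form 10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;  // not a continuation byte
    }
    return n;
}

// Hypothetical helper: true if byte offset `pos` is the start of a code
// point (or the end of the string), i.e. a safe place to cut with substr.
bool is_code_point_boundary(const std::string& utf8, std::size_t pos) {
    if (pos >= utf8.size()) return pos == utf8.size();
    return (static_cast<unsigned char>(utf8[pos]) & 0xC0) != 0x80;
}
```

For the string "h\xC3\xA9llo" (the word "héllo" in UTF-8), size() reports 6 bytes while there are only 5 code points, and substr(0, 2) would return "h\xC3", which ends in the middle of the two-byte sequence for "é". Checking is_code_point_boundary before cutting avoids that, although it still knows nothing about combining characters.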
Now, what type of string you should be using really depends on what you're doing with the string. On Windows, a std::wstring with UTF-16 encoded data is often a safe bet because the Windows API uses UTF-16.
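If your data starts out as UTF-8, it has to be converted before it is handed to a UTF-16 API. On Windows the usual route is MultiByteToWideChar with CP_UTF8. The following is only a portable sketch of what such a conversion does, written by hand so it compiles anywhere. The function name utf8_to_utf16 is my own, it returns a std::u16string rather than a std::wstring, and it does not reject overlong encodings or encoded surrogates, so treat it as an illustration and not as a validating converter.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 into code points and re-encodes them
// as UTF-16. Throws std::invalid_argument on malformed lead bytes,
// malformed continuation bytes, truncated input or out-of-range values.
// Does NOT reject overlong forms or encoded surrogates.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Fold in the 6 payload bits of each continuation byte.
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        i += len;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Fits in a single UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Needs a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On a platform where wchar_t is 16 bits, the resulting code units could be copied into a std::wstring one by one. Note that a code point above U+FFFF turns into two code units, which is exactly why UTF-16 counts as a variable-length encoding.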
So if you're getting garbage characters, the cause really depends on what you are actually doing with that string.
To help you any further, please post the code where you convert the string and the code where you're actually using it (e.g. to call some API function).