Until we get our shiny new 12c database running on its shiny new box (and all the data shifted to it), we are living with a mix of databases. To begin with, the data we managed was mostly AU/NZ and European stuff, and the character set is set accordingly. By which I mean one of those Eurocentric things and not UTF-8. We also have a bunch of columns in NVARCHAR2, with AL16UTF16 as the national character set.
I'm pretty sure the new database will start with UTF-8. But in the meantime I was responsible for trying to get emails out of the current database with data in various European and non-European character sets. My path through that forest went as follows...
- It should just work. Let me test it.....Oh bugger.
- Okay, maybe if I put "utf-8" in various bits of the message.
- And switch the code so it uses NVARCHAR2 rather than defaulting to VARCHAR2.
- Oh....UTF-16 isn't the same as UTF-8. I need to convert it somehow.
- So I can't put UTF-8 values in either my Eurocentric VARCHAR2 or UTF-16 NVARCHAR2.
- And I have to get this through SMTP, where you can still see the exposed bones of 7-bit ASCII.
AHA! HTML entities. That means I can get away with using ASCIISTR on the UTF-16 strings: it leaves plain ASCII characters alone and turns everything else into a backslash followed by four hex digits (\XXXX), one for each two-byte UTF-16 code unit. Then I swap each backslash for &#x, add a semicolon after the hex digits, and I have an HTML representation of the string!
It stinks of an ugly solution.
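As a rough illustration of the entity trick (in Python rather than in the database, so none of this is the code actually used), the sketch below fakes the ASCIISTR step and then does the backslash-to-entity swap. The function names `asciistr` and `escapes_to_entities` are made up for the example, and the stand-in only approximates what Oracle's ASCIISTR does.

```python
import re


def asciistr(text: str) -> str:
    """Approximate stand-in for Oracle's ASCIISTR (illustration only).

    ASCII characters pass through; anything else becomes \\XXXX,
    one escape per 16-bit UTF-16 code unit.
    """
    out = []
    for ch in text:
        if ch == "\\":
            # Assumption: a literal backslash is escaped so it cannot be
            # confused with the start of a \XXXX sequence.
            out.append("\\005C")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            data = ch.encode("utf-16-be")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "big")
                out.append("\\%04X" % unit)
    return "".join(out)


def escapes_to_entities(escaped: str) -> str:
    """Turn each \\XXXX escape into an HTML numeric entity &#xXXXX;."""
    return re.sub(r"\\([0-9A-F]{4})", r"&#x\1;", escaped)


if __name__ == "__main__":
    print(escapes_to_entities(asciistr("Łódź, Zoë")))
```

The result is pure 7-bit ASCII, which is the whole point. One caveat the sketch does not handle: characters outside the Basic Multilingual Plane (emoji, for instance) come out as two code units, and a pair of surrogate entities is not valid HTML, so those would need combining into a single code point first.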
I think there should be a way of sending utf-16 in the content, but I couldn't get to it.
It doesn't help that email HTML is less capable than browser HTML, and has to support a variety of older clients (plus presenting an HTML email body inside of the HTML of a webmail client is always going to be awkward).
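For what it's worth, the usual way around the 7-bit problem is to declare the body as UTF-8 in the MIME headers and let a transfer encoding (base64 or quoted-printable) carry the bytes. Here is a minimal sketch using Python's standard `email` package, assuming the text is already available as Unicode; the addresses and the `build_html_email` helper are hypothetical, and it says nothing about how to do the same from inside the database.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_html_email(sender, recipient, subject, html_body):
    """Build an HTML email whose wire form is 7-bit safe (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Non-ASCII header text is written out as RFC 2047 encoded words.
    msg["Subject"] = subject
    # The body is declared as UTF-8 and base64-encoded for transport.
    msg.set_content(html_body, subtype="html", charset="utf-8", cte="base64")
    return msg


raw = build_html_email(
    "reports@example.com",   # made-up sender
    "someone@example.com",   # made-up recipient
    "Łódź report",
    "<p>Łódź, Zoë, 東京</p>",
).as_bytes()

# What would go over SMTP contains nothing above 7-bit ASCII,
# yet the receiving side can decode it back to the original text.
decoded = message_from_bytes(raw, policy=policy.default)
```

This keeps the real characters in the message rather than entities, so it also works for plain-text parts and subject lines, where HTML entities are no help.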