21.6.1 Understanding how Mif2Go represents characters
Mif2Go has two internal ways to represent text: as printable strings, or as single characters. For compactness, Mif2Go uses the single-character form for all of the following:
• characters not in the regular printable set, such as curly quotes (straight quotes are in the printable set)
• a printable string that consists of only one character
• a printable character that has a character format applied to it alone.
Mapped entity references use Unicode
The high ASCII characters, which are not in the printable set, are heavily used in Windows. The high ASCII codes for these characters are not valid character codes in Unicode. However, the same glyphs occur in Unicode at other code points, so Mif2Go first maps them to their Unicode counterparts. For example, a bullet character in your FrameMaker document becomes the numeric entity reference &#8226; in HTML: the high ASCII decimal code for a bullet is 149, whereas the Unicode decimal code for a bullet is 8226. This mapping is applied only to text in the single-character form.
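The mapping described above can be illustrated with a minimal Python sketch. It is not Mif2Go's own code. It assumes that the high ASCII codes follow Windows code page 1252, which is consistent with the bullet example (149 becomes 8226). The function name high_ascii_to_entity is hypothetical.

```python
def high_ascii_to_entity(code: int) -> str:
    """Return an HTML numeric entity reference for a one-byte character code.

    Assumption: the code is interpreted as Windows code page 1252.
    """
    if not 0 <= code <= 255:
        raise ValueError(f"code {code} is not a single-byte value")
    try:
        # Decode the single byte to find its Unicode counterpart.
        ch = bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        # A few byte values (such as 129) are unassigned in cp1252.
        raise ValueError(f"code {code} is unassigned in cp1252") from None
    # ord() gives the Unicode code point as a decimal number.
    return f"&#{ord(ch)};"


print(high_ascii_to_entity(149))  # bullet: prints &#8226;
```

Under the same assumption, the curly double quotes at codes 147 and 148 map to &#8220; and &#8221;.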
If you specify the following option, Mif2Go omits some high ASCII characters and maps others to printable characters:
See §13.16.2 Replacing high ASCII characters for W3C validation.