13.4.3.3 Specifying encoding for double-byte characters
Character encoding determines what method is used to represent double-byte characters in the <body> section of HTML output. To specify encoding or, alternatively, numeric references:
; Encoding = ISO-8859-1 (HTML default, numeric refs),
; or None (write 0x80-0xFF as single characters)
; QuotedEncoding = No (default, W3C usage, required for JavaHelp),
; or Yes (put encoding in meta tag in single quotes, needed by some
; NumericCharRefs = Yes (default, always use &#nnn;)
For XHTML, the Mif2Go default is to claim UTF-8 as the encoding, but to use numeric references of the form &#nnn; for all characters that would have to be encoded; this satisfies all browsers. That is, Mif2Go does not actually produce any characters with values greater than 127 using the UTF-8 encoding; instead, Mif2Go uses entities for such characters, readable under any eight-bit encoding scheme.
For XHTML, you can specify a value for XMLEncoding other than the default UTF-8 (see §14.3.3 Specifying character encoding for generic XML). If you set Encoding=UTF-8, you get real UTF-8 encoding (two or more bytes for each character above 127) in place of the numeric character references. However, you can still force use of numeric references by also setting NumericCharRefs=Yes.
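The difference between numeric references and real UTF-8 can be sketched in a few lines of Python. This is only an illustration of the two representations, not Mif2Go code; the helper name to_numeric_refs and the sample text are assumptions made for the example.

```python
def to_numeric_refs(text: str) -> str:
    """Replace each character above 127 with a decimal reference of the form &#nnn;."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


sample = "café"                      # é is code point 233 (0xE9)
as_refs = to_numeric_refs(sample)    # ASCII-only text, readable under any eight-bit encoding
as_utf8 = sample.encode("utf-8")     # real UTF-8: é is stored as the two bytes 0xC3 0xA9

print(as_refs)
print(as_utf8)
```

The reference form stays pure ASCII, which is why it can be read regardless of the declared encoding; the UTF-8 form is shorter but is displayed correctly only when the reader decodes the file as UTF-8.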
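While Encoding=None is not strictly compliant, this setting can be useful for languages such as Russian, where almost the entire text would otherwise consist of numeric character references. Because each such character is then written as a single character instead of a six-character reference of the form &#nnn;, Encoding=None reduces the size of that text by about 6:1.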
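A rough way to see where a 6:1 figure can come from is to compare a Russian word stored as single bytes with the same word written as three-digit references. The sketch below assumes the text is held in a single-byte code page (Windows-1251 is used here purely as an example) and that each byte in the range 0x80-0xFF maps to one reference of the form &#nnn;. The helper name high_bytes_to_refs is hypothetical, and the mapping is illustrative rather than a description of what Mif2Go emits.

```python
def high_bytes_to_refs(data: bytes) -> str:
    """Write bytes below 0x80 as-is and bytes 0x80-0xFF as &#nnn; (illustrative only)."""
    return "".join(chr(b) if b < 0x80 else f"&#{b};" for b in data)


word = "Привет".encode("cp1251")   # six Cyrillic letters, one byte each
refs = high_bytes_to_refs(word)    # six references, six characters each
ratio = len(refs) / len(word)

print(len(word), len(refs), ratio)
```

Text that mixes in ASCII characters (spaces, digits, punctuation) sees a smaller saving, since those characters are never written as references.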
To direct Mif2Go to supply single quotes around the charset attribute value, specify QuotedEncoding=Yes:
<meta http-equiv="Content-type" content="text/html; charset='ISO-8859-1'">
The default is not to enclose the value in quotes.
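The two forms of the meta tag can be compared side by side with a small helper. This is a sketch for illustration, not part of Mif2Go; the function name content_type_meta and its quoted parameter are assumptions that mirror the QuotedEncoding setting.

```python
def content_type_meta(charset: str, quoted: bool = False) -> str:
    """Build a Content-type meta tag, optionally wrapping the charset in single quotes."""
    value = f"'{charset}'" if quoted else charset
    return f'<meta http-equiv="Content-type" content="text/html; charset={value}">'


print(content_type_meta("ISO-8859-1"))               # unquoted form (the default)
print(content_type_meta("ISO-8859-1", quoted=True))  # single-quoted form
```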
See also: §13.16.2 Replacing high ASCII characters for W3C validation