Omni Systems, Inc.

  

Mif2Go User's Guide, Version 55

  




13.16.2 Replacing high ASCII characters for W3C validation

W3C validation tests complain if a file includes any characters with ASCII decimal values 128 through 159. Presence of these characters does not preclude validation. However, if the file contains real validation errors, the W3C validator reports these characters along with the actual errors. If you fix the errors, and leave the characters, the complaint becomes just a note about “non-SGML” characters.

Note:  Leaving these characters in your document does not make the output invalid, despite the somewhat misleading way the W3C validator lists them when something else in the output is not valid.
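To see whether an output file contains any of these characters before deciding whether to act, you can scan it for bytes with decimal values 128 through 159. The following is a minimal sketch in Python; it is not part of Mif2Go, and the function name find_high_ascii is hypothetical. It assumes the file uses a single-byte encoding such as Windows-1252; in UTF-8 output, bytes in this range occur legitimately inside multibyte sequences, so the check would not be meaningful there.

```python
def find_high_ascii(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, value) pairs for bytes with decimal values 128-159.

    Hypothetical helper for illustration; assumes a single-byte encoding.
    """
    return [(i, b) for i, b in enumerate(data) if 128 <= b <= 159]


# Example: curly double quotes (147 and 148) around a word.
sample = "He said \x93hello\x94".encode("latin-1")
print(find_high_ascii(sample))  # [(8, 147), (14, 148)]
```

To check a real file, read it in binary mode (open(path, "rb")) and pass the bytes to the function.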

For most purposes you should not need to do anything about the characters in question. However, if you want to have Mif2Go remap or remove the offending characters, you can set the following option:

[HTMLOptions]
; ValidOnly = No (default, allow normal use of chars from 128 to 160),
;  or Yes (for warning-free W3C validation, remaps or removes
;  those chars)
ValidOnly=Yes

This option affects the following characters:

128 through 159 (first 32 high ASCII characters), in all fonts except Symbol, Zapf Dingbats, and Webdings.

171 and 187 (the guillemets), in macros only.

Setting ValidOnly=Yes changes the output as follows:

curly quotes become straight quotes

en dashes become hyphens

em dashes become a hyphen in text, or a pair of hyphens in macros

bullets (except those produced by <ul> tags) become mid-dots

all other characters in the range are dropped, unless you map them yourself; see §21.5 Assigning properties to text formats.

Table 13-6 shows how Mif2Go treats characters in this range when ValidOnly=Yes. Depending on which version of the Mif2Go User’s Guide you are using to view the table, some characters might not be displayed.

Table 13-6 Characters replaced or removed for W3C validation

| Value | Character | Name | Replacement character (if any) |
|---|---|---|---|
| 128 | € | euro | Removed |
| 129 | (none) | (none) | Removed |
| 130 | ‚ | single base quote | ' 039 (single quote) |
| 131 | ƒ | florin | Removed |
| 132 | „ | double base quote | " 034 (double quote) |
| 133 | … | ellipsis | Removed |
| 134 | † | dagger | Removed |
| 135 | ‡ | double dagger | Removed |
| 136 | ˆ | circumflex | Removed |
| 137 | ‰ | per thousand | Removed |
| 138 | Š | S caron | Removed |
| 139 | ‹ | left single guillemet | Removed |
| 140 | Œ | OE ligature | Removed |
| 141 | ˘ | (none) | Removed |
| 142 | Ž | Z caron | Removed |
| 143 | (none) | (none) | Removed |
| 144 | (none) | (none) | Removed |
| 145 | ‘ | left single quote | ' 039 (single quote) |
| 146 | ’ | right single quote | ' 039 (single quote) |
| 147 | “ | left double quote | " 034 (double quote) |
| 148 | ” | right double quote | " 034 (double quote) |
| 149 | • | bullet | · 183 (mid-dot), except in <ul> lists |
| 150 | – | en dash | - 045 (hyphen) |
| 151 | — | em dash | - 045 (hyphen) in text, -- (two hyphens) in macros |
| 152 | ˜ | tilde | Removed |
| 153 | ™ | trademark | Removed |
| 154 | š | s caron | Removed |
| 155 | › | right single guillemet | Removed |
| 156 | œ | oe ligature | Removed |
| 157 | ˝ | (varies; not used) | Removed |
| 158 | ž | z caron | Removed |
| 159 | Ÿ | Y diaeresis | Removed |
| ··· | | | |
| 171 | « | left double guillemet | " 034 (double quote), in macros only |
| 187 | » | right double guillemet | " 034 (double quote), in macros only |
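The effect of Table 13-6 can be illustrated with a short Python sketch. This is only an approximation of the documented behavior, not Mif2Go's actual implementation: the names REPLACEMENTS, MACRO_REPLACEMENTS, and remap_for_validation are hypothetical, the in_macro flag stands in for whatever Mif2Go uses to distinguish macro content from text, and the sketch ignores the font exceptions (Symbol, Zapf Dingbats, Webdings), the <ul> exception for bullets, and any mappings you define yourself. It assumes each character's code point equals its single-byte value, as when the input is decoded as Latin-1.

```python
# Replacements from Table 13-6, keyed by decimal character value.
# Values in 128-159 that are not listed here are removed.
REPLACEMENTS = {
    130: "'", 145: "'", 146: "'",   # single quotes -> ' (039)
    132: '"', 147: '"', 148: '"',   # double quotes -> " (034)
    149: "\u00b7",                  # bullet -> mid-dot (183)
    150: "-",                       # en dash -> hyphen (045)
    151: "-",                       # em dash -> hyphen in text
}

# Replacements that apply only inside macros (assumed flag, see lead-in).
MACRO_REPLACEMENTS = {
    151: "--",                      # em dash -> two hyphens in macros
    171: '"', 187: '"',             # guillemets -> " (034), macros only
}


def remap_for_validation(text: str, in_macro: bool = False) -> str:
    """Remap or remove characters 128-159 (plus 171/187 in macros)."""
    out = []
    for ch in text:
        code = ord(ch)
        if in_macro and code in MACRO_REPLACEMENTS:
            out.append(MACRO_REPLACEMENTS[code])
        elif 128 <= code <= 159:
            out.append(REPLACEMENTS.get(code, ""))  # unlisted: removed
        else:
            out.append(ch)
    return "".join(out)


print(remap_for_validation("\x93Hi\x94 \x96 ok"))          # "Hi" - ok
print(remap_for_validation("\xab\x97\xbb", in_macro=True))  # "--"
```

Outside macros the guillemets (171 and 187) pass through unchanged, which matches the "in macros only" entries in the table.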

See also:

§13.4.3 Specifying character encoding for HTML

§14.3.3 Specifying character encoding for generic XML

§21.5 Assigning properties to text formats


