Perhaps I should have been more explicit. The following:
<img alt="> some text <">
is, according to [1], a valid construct. However, my parser will look at
this and assume that it's more likely to be an erroneously unclosed literal
inside a tag than a literal that happens to contain '>' and '<'.
I deliberately chose to do it this way because people rarely (probably never)
use '>' and '<' in literals, but they often forget to close literals, because
the brain-dead parsers in Mosaic and Netscape let them get away with it.
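To make the heuristic concrete, here is a rough Python sketch. It is an illustration only, not the parser's actual code: the function name scan_literal and its return shape are assumptions. It treats any '<' or '>' found before the closing quote as evidence that the quote was forgotten.

```python
def scan_literal(text, start):
    """Scan a quoted literal whose opening quote is at text[start].

    Returns (value, next_index, closed). If a '<' or '>' turns up before
    the closing quote, the literal is assumed to be unclosed and is cut
    off at that character, which is left for the caller to handle.
    """
    quote = text[start]
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == quote:
            # Normal case: a properly closed literal.
            return text[start + 1:i], i + 1, True
        if ch in "<>":
            # Heuristic: assume the author forgot the closing quote.
            return text[start + 1:i], i, False
        i += 1
    # Ran off the end of the input without finding a closing quote.
    return text[start + 1:], len(text), False


value, next_index, closed = scan_literal('<img alt="> some text <">', 9)
print(repr(value), next_index, closed)
```

On the example above, the literal is reported as unclosed and empty, and scanning resumes at the '>' that follows the opening quote.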
I do something similar in a case like this:
<a href="http://foo.bar.org/" <img alt="some text"></a>
When I see this, I assume that the author omitted the trailing '>' on the
anchor tag. I flag this as a lexical syntax error and implicitly insert the
missing '>' character.
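
The recovery for a missing '>' can be sketched the same way. Again, this is a minimal Python illustration under assumptions: scan_tag, its return shape, and the error strings are hypothetical, and for simplicity this version accepts '<' and '>' inside properly closed literals rather than applying the unclosed-literal heuristic.

```python
def scan_tag(text, start):
    """Scan one tag beginning at text[start] == '<'.

    Returns (tag, next_index, error), where error is None on success or a
    short message describing the recovery that was applied.
    """
    i = start + 1
    quote = None  # quote character of the literal we are inside, if any
    while i < len(text):
        ch = text[i]
        if quote is not None:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ">":
            return text[start:i + 1], i + 1, None
        elif ch == "<":
            # A new tag starts before this one ended: flag the error and
            # implicitly insert the missing '>'.
            return text[start:i].rstrip() + ">", i, "missing '>' inserted"
        i += 1
    return text[start:], len(text), "unterminated tag"


src = '<a href="http://foo.bar.org/" <img alt="some text"></a>'
pos = 0
while pos < len(src):
    tag, pos, error = scan_tag(src, pos)
    print(tag, "|", error)
```

Run on the anchor example, the first tag comes back with the '>' supplied and an error message attached, and the img and closing anchor tags then scan normally.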
References:
[1] http://www.iaf.nl/~abigail/HTML/Myth/myth.html
Michael Johnson
Relay Technology, Inc.