Re Re, etc. ISO charsets; Unicode

Richard L. Goerwitz (goer@midway.uchicago.edu)
Mon, 26 Sep 94 23:32:14 CDT


>|>Has a formal mechanism been considered for specifying various popular
>|>coding standards, such as ISO 8859-7, ISO 8859-8, etc., and (perhaps
>|>off in the future) Unicode?
>
>Yes, it is a parameter to the text/xxx content type:-
>
>text/html; charset=ISO8859-7

But is this really a solution? Doesn't this declare a single character
set for the entire document, rather than for arbitrary sections of
text within it?
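
To make the concern concrete, here is a minimal sketch in Python. It
assumes the body is decoded using only the charset parameter of the
content type. The helper names charset_of and decode_body are
hypothetical, and the ISO-8859-1 fallback is an assumption made for
illustration, not something stated in this thread.

```python
from email.message import Message


def charset_of(content_type, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value (lowercased).

    The fallback value is an assumption made for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def decode_body(body, content_type):
    """Decode the whole body with the one charset the header names."""
    return body.decode(charset_of(content_type))


raw = b"\xe1\xe2\xe3"  # the same three bytes, read under two labels

greek = decode_body(raw, "text/html; charset=ISO-8859-7")  # "αβγ"
latin = decode_body(raw, "text/html; charset=ISO-8859-1")  # "áâã"
```

The label applies to every byte of the body, so a document mixing
Greek text with accented Western European annotations cannot be
described by one such parameter alone.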

>Again for any ancient language I suspect you will need multiple character
>sets for different periods, different script styles etc. RFC-822 is pretty
>much the same in gothic or helvetica.

Just for the record, most people working with ancient languages would be
happy with a single generic font. What they'd do is use this for general
transcription, and link in images for those who want to look at the
actual texts. E.g. Greek and Latin texts are done in a modern typeface,
not in ancient uncial or other script. Akkadian is similar - in fact
there you can get away with ISO 8859-1 for the transliterations.

It is impossible, though, to put scholarly discussions of the classics,
for example, online until it's possible to encode multilingual texts.
Typical editions of Homer, for instance, have both the Greek text and
an extensive set of modern annotations. Lest anyone laugh (what?
classical Greek online?), note that virtually the entire classical Greek
corpus is online right now, waiting for the Web to understand
multilingual and/or non-ISO-8859-1 text.
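
As a rough illustration of the multilingual problem, the sketch below
checks whether one short annotated line can be encoded in a given
character set. The sample line, the helper name encodable, and the
use of UTF-8 as a stand-in for a Unicode encoding are all assumptions
made for the example.

```python
def encodable(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical page content: unaccented Greek plus a modern
# annotation containing a Western European accented letter.
page = "μηνιν αειδε θεα (cf. Müller)"

results = {cs: encodable(page, cs)
           for cs in ("iso-8859-1", "iso-8859-7", "utf-8")}
# iso-8859-1 has no Greek letters, iso-8859-7 has no u-umlaut,
# so only the Unicode encoding can carry the whole line.
```

Neither 8-bit set covers both halves of the line, which is the
situation a typical annotated edition runs into.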

Richard Goerwitz
goer@midway.uchicago.edu