This completes the circle:
# HTML to MML
sgmls foo.html | xlisp html2mml.l >foo.mml
# Now you can load foo.mml into Frame
# and then save it as MIF. Or you
# can just do
mmltomif <foo.mml >foo.mif
# Then you can convert it back to HTML:
xlisp mif2html.l <foo.mif >foo.html
Note that this does not address the issue of converting "legacy
documents" currently in Frame to HTML. The Frame documents
have to use the right paragraph tags so that I can recognize
the SGML structure of the file.
But it has some interesting possibilities:
* Frame can be configured to convert files based on
their extension. So you can edit some Frame config
file so that when you open foo.html, it invokes
the html2mml and mmltomif filters, and voila! you
can load WWW files into FrameMaker.
I think you can do the same thing for saving. So
you could use FrameMaker much like the NeXT browser.
* Frame has hypertext which is extensible through
RPC calls. I translated the HTML sequence
<A HREF="scheme:addr">text</a>
to
<italic>
<Marker <MType 8> <MText "message www scheme:addr">>
text
<noitalic>
in MML. Marker type 8 is hypertext. So when you click on
"text", the MText is invoked. "message www" means make
an RPC call to www with "scheme:addr" as the argument.
So we could write a www RPC client that fetches WWW nodes
and hands them to Frame.
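
To make the anchor translation concrete, here is a rough sketch
in Python (not the XLISP the filter is actually written in). The
function name and the regular expression are made up for
illustration; the real filter works on the sgmls output, not on
raw text, so treat this only as a picture of the mapping.

```python
import re

# Hypothetical helper, not taken from html2mml.l. It assumes the anchor
# has exactly one HREF attribute and no nested markup.
ANCHOR = re.compile(r'<A\s+HREF="([^"]*)"\s*>(.*?)</A>',
                    re.IGNORECASE | re.DOTALL)

def anchor_to_mml(html):
    """Rewrite each <A HREF="...">text</A> as an italic run with a type 8 marker."""
    def repl(match):
        href, text = match.group(1), match.group(2)
        return ('<italic>\n'
                '<Marker <MType 8> <MText "message www %s">>\n'
                '%s\n'
                '<noitalic>' % (href, text))
    return ANCHOR.sub(repl, html)

print(anchor_to_mml('<A HREF="scheme:addr">text</a>'))
```

Text outside an anchor passes through unchanged, so the same
function could be run over a whole paragraph.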
I'm not going to distribute the code right now,
because 1) it's not very polished, and 2) the www-talk
audience didn't respond to my html->mml filter with
much enthusiasm (I assume that's because it required
you to build XLISP (easy) and SGMLs (bigger, but still
easy)).
There are a few things that I didn't bother to code yet:
* mapping Frame's funky apostrophes and quotes
to plain ASCII. (In general, we want to convert
Frame's funky "Diacritic Encoding" to whatever character
set HTML uses (ISO Latin-1?).)
* mapping <A NAME="2"> to <MText "newlink 2"> and back.
* mapping <A HREF="#2"> to <MText "gotolink 2"> and back.
* mapping <A HREF="file:foo#bar"> to <MText "gotolink foo:bar"> and back.
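
The three link mappings above could look something like the
following Python sketch. None of this exists yet; the function
names are hypothetical, and the rule that a file name never
contains a colon is an assumption made so that the reverse
mapping is unambiguous. A file: link with no #fragment is not in
the list above, so the sketch falls back to the "message www"
form for it.

```python
def name_to_mtext(name):
    """<A NAME="2">  ->  MText "newlink 2"."""
    return "newlink " + name

def href_to_mtext(href):
    """Map an HREF value to the hypertext marker text."""
    if href.startswith("#"):
        # <A HREF="#2">  ->  "gotolink 2"
        return "gotolink " + href[1:]
    if href.startswith("file:") and "#" in href:
        # <A HREF="file:foo#bar">  ->  "gotolink foo:bar"
        path, _, frag = href[len("file:"):].partition("#")
        return "gotolink " + path + ":" + frag
    # Anything else goes out through the www RPC client.
    return "message www " + href

def mtext_to_href(mtext):
    """Reverse mapping, from marker text back to an HREF value."""
    command, _, arg = mtext.partition(" ")
    if command == "gotolink":
        if ":" in arg:
            # Assumes the file name itself contains no colon.
            path, _, frag = arg.partition(":")
            return "file:" + path + "#" + frag
        return "#" + arg
    if command == "message" and arg.startswith("www "):
        return arg[len("www "):]
    raise ValueError("not a link marker: " + mtext)
```

With these, mtext_to_href(href_to_mtext(h)) gives back h for
each of the three forms in the list.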
Good night.
Dan
p.s. I discovered that the DTD in the WWW browser code
considers <LI>, <DT>, and <DD> to be empty elements.
I changed my copy of html.dtd accordingly. The one
on the web will need to be changed.
Treating them as empty causes the UL, OL, DL, etc. items to have mixed
content, which gives newlines all sorts of tricky
twists.
Mixed content is something to be avoided in SGML DTDs,
for reasons that are far too ugly to explain right now.