HTML and inlining external apps (CCI++)

David C. Martin (dcmartin@library.ucsf.edu)
Sat, 21 Jan 1995 23:49:57 +0100


We at UCSF have implemented a mechanism to handle arbitrary
MIME types in hypermedia user agents (e.g. Mosaic). Below is
a proposal for discussion of our protocol with revisions based
on the latest HTTP/1.0 specification and the proposed CCI
protocol from NCSA.

Our current implementation is for XMosaic 2.4, and we plan to release
the source code for this version soon, along with some sample
applications that enhance Mosaic (e.g. an embedded MPEG movie
player).

We would like your comments on this protocol.

dcm
- ---
Assistant Director mail: dcmartin@ckm.ucsf.edu
UCSF Library & Ctr for Knowledge Mgt at&t: 415/476-6111
530 Parnassus Avenue, Box 0840 fax: 415/476-4653
San Francisco, California 94143 page: 415/719-4846
- -------
Constituent Component Interface++ -- CCI++ C. S. Ang
University of California - San Francisco D.C. Martin

Constituent Component Interface++ -- CCI++

Status of this Memo

This document is an Internet-Draft style protocol compiled at the
Center for Knowledge Management at the University of California,
San Francisco.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-Drafts
as reference material or to cite them other than as "work in
progress."

Distribution of this document is unlimited. Please send comments to
the authors at <dcmartin@ckm.ucsf.edu> and <cheong@ckm.ucsf.edu>.

Abstract

The Constituent Component Interface++ (CCI++) is an application-
level protocol to support integrated presentation of structured
hypermedia documents. A practical hypermedia user agent (HMUA)
requires support for a wide variety of data types, presentation
techniques and access protocols. CCI++ provides a basis to build
cooperative suites of small applications that work in concert to
produce an integrated presentation of information retrieved from
network accessible resources. Component applications extend both
the presentation and access repertoire of a hypermedia user agent
by providing content-type support (e.g. PostScript™) as well as
protocol support (e.g. NNTP).

CCI++ was first named the Distributed Hypermedia Object Embedding
(DHOE) Protocol [1]. It was used in a distributed compound object
system in late 1993 to integrate text, movies, images, and 3D
object data residing at various sites on a network, via a two-way
messaging mechanism between the compound object container (browser)
and the object handlers. The original DHOE Protocol has since been
modified to remove its dependence on the X Window System, and this
specification reflects the preferred usage of the protocol.

Table of Contents

1. Introduction
1.1 Purpose
1.2 Overall Operation
2. Notational Conventions and Generic Grammar
3. CCI++ Message
4. Usage of RFC 822 and MIME Constructs
5. Request
5.1 Request-Line
5.2 Method
5.2.1 GET
5.2.2 POST
5.2.3 DISPLAY
5.2.4 SEND
5.2.5 DISCONNECT
5.2.6 QUIT
5.2.7 CONFIGURE
5.2.8 EVENT
5.2.9 UPDATE
6. Response
6.1 Status-Line
6.2 CCI++ Version
6.3 Status Codes and Reason Phrases
6.3.1 Successful 2xx
6.3.2 Redirection 3xx
6.3.3 Client Error 4xx
6.3.4 Server Errors 5xx
6.4 Response Header Fields
7. References

1. Introduction

1.1 Purpose

The Constituent Component Interface++ (CCI++) is an application-
level protocol to support integrated presentation of structured
hypermedia documents. A practical hypermedia user agent (HMUA)
requires support for a wide variety of data types, presentation
techniques and access protocols. CCI++ provides a basis to build
cooperative suites of small applications that work in concert to
produce an integrated presentation of information retrieved from
network accessible resources. Component applications extend both
the presentation and access repertoire of a hypermedia user agent
by providing content-type support (e.g. PostScript™) as well as
protocol support (e.g. NNTP).

CCI++ is based on the de facto standards for resource reference
(Universal Resource Identifier - URI) [2], location (Universal
Resource Locator - URL) [3], name (Universal Resource Name - URN)
[4]; access protocol (Hypertext Transfer Protocol - HTTP); content-
type and format specification (Multipurpose Internet Mail
Extensions - MIME) [4]; and document structure (Hypertext Markup
Language - HTML). CCI++ was first termed Distributed Hypermedia
Object Embedding (DHOE) and was used in an open and distributed
compound object system in late 1993 to integrate text, movies,
images, and 3D object data from a variety of network servers [1].

Note that since CCI++ is derived from CCI and HTTP, it adopts most
of the conventions, rules and characteristics of CCI and HTTP, and
this document is compiled based on the HTTP/1.0 draft.

1.2 Overall Operation

The CCI++ protocol is based on a request/response paradigm. Each
participating CCI++ application sends and receives messages
asynchronously in communication with a hypermedia user agent
(HMUA). The HMUA initiates all cooperating applications and
manages the presentation of the integrated hypermedia document. A
requesting program (termed a client) establishes a connection with
a receiving program (termed a server) and sends a request to the
server in the form of a request method, URI, and protocol version,
followed by a MIME-like message containing request modifiers,
client information, and possible body content. The server responds
with a status line (including its protocol version and a success or
error code), followed by a MIME-like message containing server
information, object meta-information, and possible body content. It
should be noted that a given program may be capable of being both a
client and a server; our use of those terms refers only to the role
being performed by the program during a particular connection,
rather than to the program's purpose in general.
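The message shapes described above can be sketched in a few lines of Python. This is only an illustration: the version token "CCI++/1.0", the example URI, and the function names are assumptions made for the sketch and do not come from this specification.

```python
CRLF = "\r\n"


def build_full_request(method, uri, version, headers, body=""):
    """Assemble a request line, MIME-like headers, a blank line, and a body."""
    lines = [f"{method} {uri} {version}"]
    lines += [f"{name}: {value}" for name, value in headers]
    # The empty line after the headers separates them from the body.
    return CRLF.join(lines) + CRLF + CRLF + body


def parse_status_line(line):
    """Split 'version SP status-code SP reason-phrase' into three parts."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


# "CCI++/1.0" is a hypothetical version token used only for illustration.
request = build_full_request(
    "GET", "http://example.org/doc", "CCI++/1.0", [("Accept", "text/html")]
)
```

The same program could call both functions, which matches the remark that client and server are roles held for one connection rather than fixed properties of a program.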

On the Internet, the communication generally takes place over a
TCP/IP connection. The default port is TCP 8000, but other ports
can be used. This does not preclude the CCI++ protocol from being
implemented on top of any other protocol on the Internet, or on
other networks. The mapping of the CCI++ request and response
structures onto the transport data units of the protocol in
question is outside the scope of this specification.

For the current implementation, the client establishes connection
to the server listening on a given port, and the connection is kept
alive until the service is no longer needed. However, this is not
a feature of the protocol and is not required by this
specification. Both clients and servers must be capable of handling
cases where either party closes the connection prematurely, due to
user action, automated time-out, or program failure. In any case,
the closing of the connection by either or both parties always
terminates the current request, regardless of its status. Although
not explicitly stated in the protocol, a client (e.g. browser)
usually acquires the locations of various available servers, either
through start-up resources, or from another server. A client may
also launch the server as a data handler to decipher the data types
it is unable to handle.
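One way to cope with a peer that closes the connection prematurely is to treat a short read as the end of the current request. The Python sketch below assumes a TCP socket and the default port mentioned above; the helper names and the time-out value are illustrative choices, not part of the protocol.

```python
import socket

DEFAULT_PORT = 8000  # default TCP port named in the text


def connect(host, port=DEFAULT_PORT, timeout=30.0):
    """Open a TCP connection to a server (hypothetical helper)."""
    return socket.create_connection((host, port), timeout=timeout)


def recv_exact(sock, length):
    """Read exactly `length` bytes, or raise if the peer closes early."""
    data = bytearray()
    while len(data) < length:
        chunk = sock.recv(length - len(data))
        if not chunk:
            # An empty read means the other party closed the connection;
            # whatever request was in progress is terminated.
            raise ConnectionError(
                f"peer closed after {len(data)} of {length} bytes"
            )
        data.extend(chunk)
    return bytes(data)
```

A caller would catch ConnectionError (and a socket time-out) around each exchange and abandon the request in progress, whatever its status.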

Each HMUA supports an inherent set of data types (e.g. only MIME
content-types "text/html" and "image/gif" are supported by NCSA
Mosaic) and an inherent set of protocols; other types and protocols
are supported through helper applications and gateways. However,
helper applications are not integrated with the browser and gateway
functionality is dependent on the existence of a gateway server.
The CCI protocol extends the HMUA with support for arbitrary data
types and access protocols.

The HMUA initiates a CCI++ connection with a locally defined
application whenever a server responds with a non-inherent content-
type. The local application may simply translate the format (e.g.
"image/jpeg" to "image/gif"), acting as a local filter, or it may
present a control panel and allow the user to control the
integrated presentation of the resource (e.g. an "image/mpeg" movie
viewer). In addition to content-types, the HMUA also may launch a
local application when attempting to access a resource via a non-
inherent protocol.
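The decision described above, whether to display a response directly or hand it to a local application, could look roughly like the following Python sketch. The handler table and its command names are hypothetical; only the two inherent types are taken from the NCSA Mosaic example in the text.

```python
# Inherent types from the NCSA Mosaic example in the text.
INHERENT_TYPES = {"text/html", "image/gif"}

# Hypothetical local configuration: content-type -> application to launch.
HANDLERS = {
    "application/pdf": "pdf-handler",
    "image/jpeg": "jpeg-to-gif-filter",
}


def choose_action(content_type):
    """Return ('display', None), ('launch', command) or ('unsupported', None)."""
    # Drop any parameters such as "; charset=..." before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in INHERENT_TYPES:
        return ("display", None)
    if media_type in HANDLERS:
        return ("launch", HANDLERS[media_type])
    return ("unsupported", None)
```

A real HMUA would presumably read the handler table from its start-up resources; a dictionary keeps the sketch self-contained.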

Example:

   1. Mosaic requests URL http://<server>/<opaque>
   2. HTTP server responds with a content-type of
      "application/pdf"
   3. Mosaic launches local application to handle Adobe PDF
      format
   4. PDF helper application returns "text/html" content to
      Mosaic:
         <HTML>
         <HEAD>
         <TITLE>CCI Protocol</TITLE>
         </HEAD>
         <BODY>
         <A HREF="http://<server>/<opaque>">
         <IMG SRC="CCI%20Protocol/Page1" ISMAP>
         </A>
         </BODY>
         </HTML>
   5. Mosaic displays the content with the mapped image
   6. User clicks on mapped image
   7. Mosaic informs local application of requested resource:
         301 GET http://<server>/<opaque>?301,402
   8. Local application interprets URL and sends content to
      Mosaic
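As an illustration of step 8, a local application might pick the click coordinates out of the notification from step 7 as follows. This Python sketch assumes the notification is a single line of the form shown in the example, with the ISMAP coordinates appended as "?x,y"; the function name and the example URL are made up.

```python
def parse_click_notification(line):
    """Split '301 GET <url>?x,y' into code, method, base URL and (x, y)."""
    code, method, url = line.strip().split(" ", 2)
    base, _, query = url.partition("?")
    # The ISMAP convention appends the click position as "x,y".
    x, y = (int(value) for value in query.split(","))
    return int(code), method, base, (x, y)
```

The application could then map the coordinates to a region of the rendered page and return new "text/html" content to the browser.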

2. Notational Conventions and Generic Grammar

Please refer to the HTTP/1.0 draft for information about Augmented
BNF and basic rules.

3. CCI++ Message

CCI++ messages consist of requests from client to server and
responses from server to client.

    CCI++-message = Simple-Request
                  | Simple-Response
                  | Full-Request
                  | Full-Response

Full-Request and Full-Response use the generic message format of
RFC 822 [5] for transferring objects. Both messages may include
optional header fields (a.k.a. "headers") and an object body. The
object body is separated from the headers by a null line (i.e., a
line with nothing preceding the CRLF).
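A receiver can separate the headers from the object body by looking for the first null line. The Python sketch below is a simplified illustration: it ignores RFC 822 header folding and keeps only the last value when a header name repeats, and the function name is an assumption.

```python
def split_message(raw):
    """Split a full message into (first line, headers dict, body bytes)."""
    head, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no null line between headers and body")
    first_line, *header_lines = head.split(b"\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(b":")
        # Header names are compared case-insensitively, so store lowercase.
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("ascii")
    return first_line.decode("ascii"), headers, body
```

The body is kept as bytes because it may carry any content-type, not only text.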

4. Usage of RFC 822 and MIME Constructs

Like HTTP/1.0, CCI++ reuses many of the constructs defined for
Internet Mail (RFC 822, [5]) and the Multipurpose Internet Mail
Extensions (MIME, [4]) to allow objects to be transmitted in an
open variety of representations. Please refer to the HTTP/1.0
draft for the constraints that HTTP/1.0 does not obey, and for
further information about the various formats and types.

5. Request

A request message from a client to a server includes, within the
first line of that message, the method to be applied to the object
requested, the identifier of the object, and the protocol version
in use. For backwards compatibility with the more limited HTTP/0.9
protocol, there are two valid formats for a CCI++ request:

    Request        = Simple-Request | Full-Request

    Simple-Request = "GET" SP URI CRLF      ; HTTP/0.9 request

    Full-Request   = Request-Line           ; see Section 5.1
                     General-Header         ; see Section 4.3
                     Request-Header         ; see Section 5.5
                     Object-Header          ; see Section 7
                     CRLF
                     [ Object-Body ]        ; see Section 3.2

If a CCI++ server receives a Simple-Request, it must respond with
a CCI++ Simple-Response. Similarly, if a client receives a
response that does not begin with a Status-Line, it should assume
that the response is a Simple-Response and parse it accordingly.
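A client could apply this rule with a simple pattern test. The Python sketch below assumes that a Status-Line has the shape "version SP three-digit-code SP reason-phrase CRLF" and that the version token looks like "name/major.minor", as in HTTP; both the pattern and the function name are illustrative.

```python
import re

# Assumed Status-Line shape: <name>/<major>.<minor> SP 3DIGIT SP reason CRLF
STATUS_LINE = re.compile(rb"^\S+/\d+\.\d+ \d{3} [^\r\n]*\r\n")


def classify_response(raw):
    """Return 'full' if the response starts with a Status-Line, else 'simple'."""
    return "full" if STATUS_LINE.match(raw) else "simple"
```

Anything that does not match is handed on as a bare object body, which is how a Simple-Response would be treated.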

Please refer to HTTP/1.0 draft for detailed information about
different types of requests as well as definitions and formats of
version information, URI, and request header fields.

5.1 Request-Line

The Request-Line begins with a method token, followed by the URI
and the protocol version, and ending with CRLF. The elements are
separated by SP characters. No CR or LF are allowed except in the
final CRLF sequence.

    Request-Line = Method SP URI SP HTTP-Version CRLF
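The Request-Line rule can be checked mechanically. The following Python sketch follows the grammar above literally (three elements, single SP separators, one trailing CRLF); it does not attempt to handle the CCI-style command forms shown later in Section 5.2, and its function name and example version token are assumptions.

```python
def parse_request_line(line):
    """Return (method, uri, version) or raise ValueError if malformed."""
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    core = line[:-2]
    if "\r" in core or "\n" in core:
        raise ValueError("CR or LF is only allowed in the final CRLF")
    parts = core.split(" ")
    # Exactly three non-empty elements, each separated by one SP.
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected: Method SP URI SP Version")
    method, uri, version = parts
    return method, uri, version
```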

5.2 Method

The Method token indicates the method to be performed on the
resource identified by the URI. The method is case-sensitive and
extensible.

    Method           = "GET" | "DISPLAY" | "POST"
                     | "SEND" | "DISCONNECT" | "QUIT"
                     | "CONFIGURE" | "EVENT" | "UPDATE"
                     | extension-method
    extension-method = token

Note that all CCI methods are kept for backward compatibility.
The methods that are not available in CCI are CONFIGURE, EVENT, and
UPDATE.

The methods SEND, DISCONNECT, CONFIGURE, EVENT, and UPDATE must be
supported by all conforming CCI++ servers. The list of methods
acceptable by a specific resource can be specified in an "Allow"
Object-Header (HTTP/1.0 draft Section 7.1). However, the client is
always notified, through the return code of the response, whether
a method is currently allowed on a specific resource, as this can
change dynamically. The set of common methods for CCI++ is
described below. Although this set can be easily expanded,
additional methods cannot be assumed to share the same semantics
for separately extended clients and servers. Servers should return
the Status-Code "501 Not Implemented" if the method is unknown.
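A server's method check might be sketched as below in Python. The comparison is deliberately case-sensitive, as the text requires. Returning "405 Method Not Allowed" for a known method that a resource does not currently accept is an assumption based on the status list in Section 6.3.3; the function and parameter names are illustrative.

```python
# Common methods listed in the grammar above.
COMMON_METHODS = frozenset({
    "GET", "DISPLAY", "POST", "SEND", "DISCONNECT",
    "QUIT", "CONFIGURE", "EVENT", "UPDATE",
})


def check_method(method, allowed=COMMON_METHODS):
    """Return the (Status-Code, Reason-Phrase) a server might answer with."""
    if method not in COMMON_METHODS:
        # Unknown (or wrongly cased) method token.
        return (501, "Not Implemented")
    if method not in allowed:
        # Known method, but not accepted by this resource right now.
        return (405, "Method Not Allowed")
    return (200, "OK")
```

Because the set of allowed methods can change dynamically, `allowed` is passed per call rather than fixed at start-up.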

5.2.1 GET

GET ( URL | URN ) <url> [ OUTPUT ( CURRENT | NEW | NONE ) ]

As in HTTP, the GET command tells the browser to resolve the given
URL. The resulting object will either be displayed by the browser
(when the output option is CURRENT or NEW) or returned to the
external application that issued the request (when the output
option is NONE). The output option may be omitted, in which case
the object is displayed in the current window.

If the output option NONE is specified then the URL will not be
displayed by the browser, rather it will be returned directly to
the external application. This allows an external program which
does not understand HTTP to use the browser for data retrieval.
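The command form above, including the default for the output option, can be illustrated with a small Python parser. This is a sketch of the syntax line only; the dictionary keys and the function name are made up for the example.

```python
OUTPUT_OPTIONS = ("CURRENT", "NEW", "NONE")


def parse_get(command):
    """Parse 'GET ( URL | URN ) <url> [ OUTPUT ( CURRENT | NEW | NONE ) ]'."""
    tokens = command.split()
    if len(tokens) not in (3, 5) or tokens[0] != "GET":
        raise ValueError("not a GET command")
    if tokens[1] not in ("URL", "URN"):
        raise ValueError("expected URL or URN")
    output = "CURRENT"  # default: display in the current window
    if len(tokens) == 5:
        if tokens[3] != "OUTPUT" or tokens[4] not in OUTPUT_OPTIONS:
            raise ValueError("bad OUTPUT option")
        output = tokens[4]
    return {
        "kind": tokens[1],
        "target": tokens[2],
        "output": output,
        # With NONE the object goes back to the external application.
        "return_to_application": output == "NONE",
    }
```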

5.2.2 POST

POST <host>:<port>
Content-Length: (length of MIME body)
..MIME body...

The POST command tells the browser to forward the following MIME
body to the specified HTTP server (<host>:<port>). It serves the
original purposes of the POST method in HTTP:

o Annotation of existing documents;

o Posting a message to a bulletin board topic, newsgroup,
mailing list, or similar group of articles;

o Providing a block of data (usually a form) to a data-handling
process, or a script which can be run by such a process;

o Extending a document during authorship.

A successful POST does not require that the object be created as a
resource on the origin server or made accessible for future
reference. That is, the action performed by the POST method might
not result in a resource that can be identified by a URI. In this
case, a "200 OK" is the appropriate Status-Code returned in the
response. If a resource has been created, "201 Created" should be
the response.

Note: The user agent may not assume any postconditions of the
method in terms of web topology. For example, if a POST is
accepted, the effect may be delayed or overruled by human
moderation, batch processing, etc. The user agent should not rely
on the resource being immediately (or ever) created.

5.2.4 SEND

The SEND method is used to transmit information from the external
application to the browser and vice-versa. To be backward
compatible with CCI, this method has several variations:

(a) SEND DATA

SEND DATA [ ( CURRENT | NEW ) ]
Content-Type: data type
Content-Length: (length of data body)
..MIME body...

The SEND DATA command transmits data from the external application
to the browser which then displays the data. The data message
begins following the <CRLF> after the Content-Length line.

The optional output (CURRENT|NEW) suggests to the browser where to
display the document. With the CURRENT option, the document should
appear in the active window, while NEW would specify that a new
window should be created to display the data. If output is not
specified the default is to display in the current active window.
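An external application could assemble a SEND DATA message as in the Python sketch below. It follows the statement above that the data begins immediately after the CRLF that ends the Content-Length line, so no extra blank line is inserted; whether an implementation expects one is not settled by this text. The function name is illustrative.

```python
def build_send_data(body, content_type, output=None):
    """Build a SEND DATA message; `body` must be bytes."""
    if output not in (None, "CURRENT", "NEW"):
        raise ValueError("output must be CURRENT, NEW or omitted")
    first_line = "SEND DATA" + (f" {output}" if output else "")
    head = (
        f"{first_line}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
    )
    # The data follows directly after the Content-Length line's CRLF.
    return head.encode("ascii") + body
```

Omitting `output` leaves the choice to the browser, which by default displays in the current active window.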

(b) SEND ANCHOR
SEND ANCHOR STOP

This message informs the browser to send all activated URL anchors
to the external application. The browser is intended to continue
sending activation messages until a SEND ANCHOR STOP message is
sent. Output from the browser to the external application will be:

301 ANCHOR <url>

(c) SEND OUTPUT <type>
SEND OUTPUT STOP <type>

This message informs the browser to send all subsequent objects
with the specified MIME <type> to the external application. The
recipient will then be responsible for displaying the output.

Output from the browser will be:

306 Viewer output
Content-Type: data type
Content-Length: (length of data body)
... data body ...

5.2.5 DISCONNECT

In CCI, this method notifies the browser that connecting
applications will be exiting. CCI++ requires the external
application to understand this command as well (e.g. the browser
may try to shut down the connection when it is being manually
terminated by a user).

5.2.6 QUIT

In CCI, this request tells the browser to shut down. It is intended
for cleanup when the browser is used as a slave process and the
master application is exiting or will no longer be using the
browser. In CCI++, QUIT is also used by the browser to shut down
the external application: in a distributed compound document
system, the user interacts primarily with the browser, so the
browser should be able to terminate an external application or data
handler when it no longer needs that data-handling service.

5.2.7 CONFIGURE

CONFIGURE is used by the external application to inform the browser
of specific actions of which it wishes to be informed. It has the
following forms:

(a) CONFIGURE EVENT CLEAR ( BUTTON | KEY | AREA | ALL )
    CONFIGURE EVENT ( ADD | REMOVE ) ( BUTTON | KEY | AREA ) <p> <ap>

CONFIGURE EVENT is used by an external application to inform the
browser of the event(s) in which it is interested; the optional
CLEAR token is used for reset. The type, parameter, and
additional_parameter can be one of the following:

    type     parameter                    additional parameter
    ====     =========                    =====================
    BUTTON   ( UP | DOWN | MOTION )       <button code>
    KEY      ( UP | DOWN )                <key code>
    AREA     ( EXPOSE | HIDE | DESTROY
             | RESIZE | ENTER | LEAVE )

Output may be:

200 OK ; events successfully cleared or registered

Any registered events that occur subsequent to the CONFIGURE EVENT
request will be sent to the external application asynchronously.
For BUTTON and KEY events, modifier information is included in the
message; allowable modifier states are: SHIFT, CAPS, CONTROL, and
META. Output may be:

306 Viewer event
AREA EXPOSE
306 Viewer event
BUTTON DOWN 1 SHIFT META
306 Viewer event
KEY DOWN c CONTROL
306 Viewer event
BUTTON MOTION 200 300 SHIFT META
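An external application receiving these notifications might decode the event line as follows. This Python sketch is based only on the four sample lines above: it treats trailing modifier names on BUTTON and KEY events as modifiers and everything before them as arguments. The field names are made up for illustration.

```python
MODIFIERS = {"SHIFT", "CAPS", "CONTROL", "META"}


def parse_viewer_event(line):
    """Decode an event line such as 'BUTTON DOWN 1 SHIFT META'."""
    tokens = line.split()
    kind, action, rest = tokens[0], tokens[1], tokens[2:]
    modifiers = []
    if kind in ("BUTTON", "KEY"):
        # Strip modifier names from the end, but always keep the first
        # argument (button or key code) even if it looks like a modifier.
        while len(rest) > 1 and rest[-1] in MODIFIERS:
            modifiers.insert(0, rest.pop())
    return {"type": kind, "action": action, "args": rest, "modifiers": modifiers}
```

For the BUTTON MOTION sample, the two numeric arguments are kept as strings; the text does not say what they represent, so the sketch does not interpret them.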

(d) CONFIGURE METHOD ( GET | POST | PUT ) ( NOTIFY | START | STOP )

CONFIGURE METHOD is used by an external application to capture the
browser's actions on different anchor events. The external
application can choose to be simply notified or to have complete
control (i.e. stop the browser from processing all anchor events)
over all anchor events. The anchor events are forwarded to the
external application until it issues a STOP command of the same
type.

    type     additional_parameter
    ====     ======================
    GET      NOTIFY, START, or STOP
    POST     NOTIFY, START, or STOP
    PUT      NOTIFY, START, or STOP

Output may be:

    307 Viewer method
    GET URL http://www.foo.com/blech?200,300
    307 Viewer method
    POST URL nntp://news.foo.com/comp.infosystems?follow-up-to=1234
    Content-type: body type
    Content-length: (length of MIME body)
    ... MIME body...

5.2.8 EVENT

The EVENT method is designed for the browser to send the event(s)
an external application is interested in when it/they occur(s).

e.g. EVENT CLIENT_AREA DESTROYED
EVENT KEY_UP XK_p
EVENT BUTTON_MOVE 3

5.2.9 UPDATE

The UPDATE command tells the data handler/browser to give the most
up-to-date output (e.g. recompute when there is a GUI state change
or refresh when the image in the drawing area window does not
correspond to that in the buffer).

6. Response

If the client has issued a CCI++ request, the response from the
server shall consist of the following:

    Response        = Simple-Response | Full-Response

    Simple-Response = [ Object-Body ]

    Full-Response   = Status-Line           ; see Section 6.1
                      *General-Header       ; see Section 4.3
                      *Response-Header      ; see Section 6.4
                      *Object-Header        ; see Section 7
                      CRLF
                      [ Object-Body ]       ; see Section 3.2

6.1 Status-Line

The Status-Line consists of the protocol version followed by a
numeric status code and its associated textual phrase, with each
element separated by SP characters. No CR or LF is allowed except
in the final CRLF sequence.

    Status-Line = CCI++-Version SP Status-Code SP Reason-Phrase CRLF

6.2 CCI++ Version

The CCI++-Version element identifies the protocol version being
used by the server. The format of this field is identical to the
corresponding HTTP-Version field in the Request-Line described in
HTTP/1.0 draft Section 5.3.

6.3 Status Codes and Reason Phrases

The Status-Code element is a 3-digit integer result code of the
attempt to understand and satisfy the request. The Reason-Phrase is
intended to give a short textual description of the Status-Code.
The Status-Code is intended for use by automata and the Reason-
Phrase is intended for the human user. The client is not required
to examine the Reason-Phrase, nor to pass it on to the human user.

    Status-Code   = 3DIGIT

    Reason-Phrase = token *( SP token )

All responses, regardless of the Status-Code, may contain an Object-
Header and/or an Object-Body. This can either be the object pointed
to by the requested URI or an object containing further explanation
of the Status-Code. In the latter case, the preferred media type is
"text/html", but "text/plain" is also acceptable.

The first digit of the Status-Code defines the class of responses
known to HTTP. The last two digits do not have any categorization
role. There are 5 values for the first digit:

o 1xx: Not used, but reserved for future use

o 2xx: Success - The requested action was successfully received
and understood

o 3xx: Redirection - Further action must be taken in order to
complete the request

o 4xx: Client Error - The request contains bad syntax or is
inherently impossible to fulfill

o 5xx: Server Error - The server could not fulfill the request

The values of the numeric status codes and an example set of
corresponding Reason-Phrases are presented below. Every Status-
Code has a description of which method it can follow and any
metainformation required in the HTTP-header (refer to HTTP/1.0
draft).
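Since only the first digit of the Status-Code carries the class, an automaton can classify any code, including ones it does not recognize, by integer division. A minimal Python sketch, with an illustrative function name:

```python
STATUS_CLASSES = {
    1: "Reserved",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a 3-digit Status-Code to its class using the first digit only."""
    if not 100 <= code <= 599:
        raise ValueError("Status-Code must be between 100 and 599")
    return STATUS_CLASSES[code // 100]
```

This lets a client handle, for example, the CCI++-specific 305, 306 and 307 codes as members of the 3xx class even if it has no specific handling for them.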

6.3.1 Successful 2xx

This class of status codes indicates that the client's request was
successfully received and understood.

200 OK
201 Created
202 Accepted

Please refer to HTTP/1.0 draft for detailed information about the
2xx class messages (and messages 203-206).

6.3.2 Redirection 3xx

This class of status codes indicates that further action needs to
be taken by the client/external application in order to fulfill
the request.

CCI++ may not need all the redirection messages (301-304) in
HTTP/1.0; interested readers may refer to the HTTP/1.0 draft for
further details.

305 Acceptable options

This is required by CCI++ for negotiation between the browser and
an external application.

6.3.3 Client Error 4xx

The 4xx class of status codes is intended for cases in which the
client seems to have erred. The codes can follow any method
described in Section 5.2, and the set consists of:

400 Bad Request
401 Unauthorized
402 Payment Required
403 Forbidden
404 Not Found
405 Method Not Allowed
406 None Acceptable
407 Proxy Authentication Required
408 Request Timeout

Please refer to HTTP/1.0 draft for detailed information about the
4xx class messages.

6.3.4 Server Errors 5xx

Response status codes beginning with the digit "5" indicate cases
in which the server is aware that it has erred or is incapable of
performing the request. These codes can follow any method at any
time.

Note: For all of the 5xx codes, the server is encouraged to send
back a CCI++-header and an Object-Body containing an explanation of
the error situation, and whether it is a temporary or permanent
condition.

500 Internal Server Error
501 Not Implemented
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout

Please refer to HTTP/1.0 draft for detailed information about the
5xx class messages.

6.4 Response Header Fields

Please refer to HTTP/1.0 draft.

7. References

[1] C. Ang, D. Martin, M. Doyle. "Integrated Control of
Distributed Volume Visualization through the World-Wide-Web." IEEE
Visualization '94 Conference Proceedings, 13-20, October 1994.
<URL: http://128.218.33.80/public/Projects/Applets/Abstract.html>

[2] T. Berners-Lee. "Universal Resource Identifiers in WWW: A
Unifying Syntax for the Expression of Names and Addresses of
Objects on the Network as used in the World-Wide Web." RFC 1630,
CERN, <URL:http://ds.internic.net/rfc/rfc1630.txt>, June 1994.

[3] T. Berners-Lee, L. Masinter, and M. McCahill. "Uniform
Resource Locators (URL)." Internet-Draft (work in progress), CERN,
Xerox PARC, University of Minnesota, <URI:http://ds.internic.net/
internet-drafts/draft-ietf-uri-url-08.txt>, October 1994.

[4] N. Borenstein and N. Freed. "MIME (Multipurpose Internet Mail
Extensions) Part One: Mechanisms for Specifying and Describing the
Format of Internet Message Bodies." RFC 1521, Bellcore, Innosoft,
<URL:http://ds.internic.net/rfc/rfc1521.ps>, September 1993.

[5] D. H. Crocker. "Standard for the Format of ARPA Internet Text
Messages." STD 11, RFC 822, UDEL,
<URL:http://ds.internic.net/rfc/rfc822.txt>, August 1982.

------- End of Blind-Carbon-Copy