>I have to agree with Nick. Last year I developed a gateway to a UniSQL
>database, to which Verity's "Photo Search Demo" bears many similarities.
>A prototype is usually running at:
>
> <URL:http://abelard.mit.edu/cgi-bin/museum-entrance/>
>
>Unfortunately, it lacked a full-text engine and the usefulness of
>such a feature was quickly obvious. (Although several people have
>been interested in buying the capability from us anyway.)
>
>We are currently developing server technology that integrates database
>and full-text engines for searching indexes of both local and remote
>data. The current design DOES use Sybase. However, it is also cgi-based.
>I would be interested in hearing more discussion of "server dream
>features" that require a non-cgi approach. What did you have in mind?

First, I should clarify that I'm suggesting that a combination of
relational and full-text (inverted index, etc.) models can do more than
either one can alone. The two are optimized for quite different
performance characteristics.

I didn't mean to imply that our engine, or anyone else's, can achieve the
kind of field search performance that a relational database does.

One of our guys, Dave Glazer, wrote a paper for the WWW conference, in
which he talks about why we chose to build an integrated server instead of
going the CGI route. It's at
<URL:http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/HCI/glazer/glazer.html>.

One of our competitors in the Web arena built a server by linking their
search engine to HTTPd via CGI scripts written in Perl. It got them to
market quickly, but I can't imagine that Perl is going to perform or scale
the way that a C-coded integrated server will. I'd be a bit nervous about
basing a commercial server on Perl in any event, just because it's not a
real commercial product itself.

Of course, for all I know they're already busy re-coding in C.

As far as ideas go, we'd love to be able to work directly with the database
files, but that's a hard one to crack. We're more likely to adopt some
sort of brokering approach, where queries execute in parallel against each
engine, rather than passing through gateways.
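
To make the brokering idea a bit more concrete, here is a minimal sketch in
Python. It is only an illustration, not a description of our server: the two
backend functions, their toy data, and the name broker() are all
hypothetical, and intersecting the result sets is just one possible way to
combine the answers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for a relational engine and a full-text engine.
# Each takes the same query dict and returns a set of matching document ids.
def relational_search(query):
    rows = {1: {"artist": "monet"}, 2: {"artist": "degas"}, 3: {"artist": "monet"}}
    return {doc_id for doc_id, row in rows.items() if row["artist"] == query["artist"]}

def fulltext_search(query):
    texts = {1: "water lilies at dusk", 2: "dancers at rest", 3: "haystacks in snow"}
    return {doc_id for doc_id, text in texts.items() if query["text"] in text.split()}

def broker(query, backends):
    """Send the query to every backend at once, then intersect the results."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = [pool.submit(backend, query) for backend in backends]
        result_sets = [future.result() for future in futures]
    return set.intersection(*result_sets)

hits = broker({"artist": "monet", "text": "lilies"},
              [relational_search, fulltext_search])
print(sorted(hits))
```

The point of the sketch is that neither engine sits behind the other: each
one answers the part of the query it is good at, and the broker merges the
answers.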

Nick