Oracle interMedia provides full-text indexing of data in an Oracle
database. For an Oracle-backed website, interMedia allows users to
search the content of the site by keywords, a key feature for most web
services. For programmers, interMedia provides a tightly coupled SQL
interface for writing search queries.
The current Site-wide Search module in the ArsDigita
Community System (ACS) uses interMedia to index text in an arbitrary
number of tables. This module makes it easy to index additional data
in any tables in the data model, demonstrating the power of
interMedia.
Unfortunately, if interMedia is not configured correctly from the
start, it is a nightmare to administer. This article presents an
overview of our implementation of site-wide full-text search with
Oracle interMedia for http://www.guidestar.org and http://www.volunteersolutions.org,
based on our experience with what does and does not work in an
interMedia installation. In particular, we focus on maintaining the
interMedia indexes so that they are always up-to-date and never
break. We do not address setup and configuration details, which are fairly
well-documented (see http://technet.Oracle.com/products/intermedia/htdocs/text_techover.html). In
most cases, installation is not difficult and requires you to source a
few files in the $ORACLE_HOME/ctx directory and to configure Net8.
Alternatives to Oracle interMedia
Full-text search products can be placed into two broad categories:
- Stand-alone software primarily used to search static Web sites
or flat files. One example is PLS (http://www.pls.com), which requires
database users to export their data to the file system and then to
index it using the PLS software.
- Existing Web sites that offer their search technology as a
service for other web sites. One example is AltaVista Search (http://solutions.altavista.com/),
which provides out-of-the-box flat-file searching and a developer's
kit to integrate with a database or other source of data.
When using the Oracle RDBMS, interMedia provides the following advantages:
- Faster synchronization of changing table data with the
full-text index. Because interMedia is part of Oracle, we can provide
immediate, or very rapid, synchronization.
- Tight SQL integration, keeping the query format similar in
structure to SQL. This integration allows new users to learn the query
language more easily.
- Potential for better scoring by indexing only the text we
need. We can easily "tell" interMedia to index only the data in which
we are interested, ignoring much of the noise apparent in text found
in web pages (e.g. HTML used for layout).
- Any application that talks to Oracle can take advantage of
the existing interMedia indexes, making it easy for external
applications to execute the same queries as the production web site.
Overview
We implemented site-wide full-text search on both GuideStar.org, a site that lists
financial information on over 700,000 American non-profit
organizations, and VolunteerSolutions.org, a
much smaller site that provides volunteers with information about local
opportunities. Both of the sites frequently update the information
associated with each record, requiring us to constantly update the
indexes used for full-text searching. Most of the code samples we
present in this article are taken from the Volunteer Solutions site as
the logic is simpler and the code is easier to understand.
In both sites, the data we wanted to index was spread out among a few
tables and columns. The implementation of full-text search consists
of the following major steps:
- Synchronize the data from all the sources into one central
table (the process of denormalizing the data to index)
- Synchronize the interMedia index or indexes with the denormalized data
Step One: Synchronizing the data to index
We need to consolidate the data we want to index because it comes from
multiple sources (i.e. the data is dispersed across several
tables). The first step is to create a central, site-wide
search table. We need to include at least one column containing all
the text we want to index and we should include extra information that
we will use when presenting results to the user. Placing this
information in the same table is important for the following reasons:
- interMedia will compute statistics about all the text we are
indexing. By using one table to store everything, we are able to
compare results from various queries. If we stored the data in
separate tables, we would have no statistics or information by which
to make a solid comparison.
- Searching one table will be faster than searching through
two joined tables. Though performance is not always a concern, it is a
nice side-effect of maintaining a site-wide search table.
-- VolunteerSolutions.org site-wide search table
create table search (
-- the primary key points back to the detailed opportunity pages
opp_id integer
constraint search_opp_id_pk primary key
constraint search_opp_id_fk references vs_opp,
-- the agency that is sponsoring this volunteer opportunity
agency_id integer
constraint search_agency_id_fk references vs_agency
constraint search_agency_id_nn not null,
-- basic user group information to save a join with user groups!
group_id integer constraint search_group_id_fk references user_groups,
parent_group_id integer constraint search_parent_group_id_fk references user_groups,
-- different ways of looking at the sponsoring agency's name
agency_name varchar(150),
agency_name_sort_key varchar(150),
agency_name_initial char(1),
-- where is this opportunity located?
city varchar(100),
state varchar(2),
-- first 5 digits of the zip code
zip varchar(5),
-- is this a one-time or ongoing opportunity
one_time_p char(1),
-- latitude and longitude for zipcode/radius searching
latitude number(9,6),
longitude number(9,6),
-- different ways of looking at the name of this volunteer opportunity
headline varchar(200),
headline_sort_key varchar(200),
headline_initial varchar(200),
-- is this opportunity private to the specified group_id?
private_p char(1),
-- foreign URL describing the opportunity (or agency) in more detail
url varchar(100),
-- the text we want to index
keywords clob,
-- a description of the opportunity to show users
description clob,
-- one clob for each category type (taken from the categories table)
-- this lets us search for keywords within a category. Note that
-- we could have created the same functionality using interMedia's
-- Section Groups
category_group clob,
category_interest clob,
category_skill clob,
category_time clob,
category_interaction clob
);
Next, we need to collect the data we want to index from within the
data model. On both GuideStar and Volunteer Solutions, we wanted to
index text stored in approximately ten columns throughout the data
model. We created PL/SQL procedures to combine the data we wanted to
index from all the tables into the single keywords column of the
search table. The following code sample is the portion of the PL/SQL
procedure from VolunteerSolutions.org that updates most of the columns
in the search table.
update search
set agency_id=v_agency_id,
group_id=v_group_id,
parent_group_id=v_parent_group_id,
zip=v_zip,
city=v_city,
state=v_state,
one_time_p=v_one_time_p,
latitude=v_latitude,
longitude=v_longitude,
headline=v_headline,
headline_sort_key=v_headline_sort_key,
headline_initial=v_headline_initial,
url=v_url,
agency_name=v_agency_name,
agency_name_sort_key=v_agency_name_sort_key,
agency_name_initial=v_agency_name_initial,
private_p=v_private_p,
keywords=empty_clob(),
description=empty_clob(),
category_group=empty_clob(),
category_interest=empty_clob(),
category_skill=empty_clob(),
category_time=empty_clob(),
category_interaction=empty_clob()
where opp_id=v_opp_id;
-- now update all the lobs
select keywords, description, category_group, category_interest,
category_skill, category_time, category_interaction
into v_keywords, v_description, v_category_group, v_category_interest,
v_category_skill, v_category_time, v_category_interaction
from search where opp_id=v_opp_id for update;
-- pull out the description (also a lob)
select description into v_description_temp from vs_opp where opp_id=v_opp_id;
-- set up the new keywords
dbms_lob.writeappend(v_keywords, length(v_buffer), v_buffer);
dbms_lob.writeappend(v_keywords, 1, ' ');
dbms_lob.append(v_keywords, v_description_temp);
dbms_lob.append(v_description, v_description_temp);
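The v_buffer variable referenced above is assembled earlier in the
procedure from the short text fields of the opportunity and its
sponsoring agency. As a rough sketch of that step (the exact
concatenation in the VolunteerSolutions.org code may differ):
-- hypothetical assembly of v_buffer from the short text fields;
-- the real procedure may pull in additional columns
v_buffer := v_headline || ' ' || v_agency_name || ' ' ||
            v_city || ' ' || v_state || ' ' || v_zip;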
Once the site-wide search table is populated, we need to ensure that
future inserts, updates, and deletes to the underlying data are
reflected in the information kept in the site-wide search table. There
are two basic approaches to synchronizing the data stored in the
site-wide search table:
- Immediately update the data
- Mark the row in the site-wide search table for later update
Approach 1: Immediately update the data
If the amount of work we need to do to update an individual row of
data in the site-wide search table is small, immediately updating the
data in the site-wide search table makes sense. To implement immediate
updates, create row-level triggers on all the tables that must be
accessed to update the corresponding data in the search
index. Whenever an underlying table is modified, a trigger fires,
immediately updating, or removing, the information in the site-wide
search table. We use this approach in ACS to keep static-pages
up-to-date:
create or replace trigger static_pages_sws_insert_tr
after insert on static_pages for each row
WHEN (NEW.index_p = 't')
BEGIN
-- we have to create a new row in the index table for this row.
insert into site_wide_index
(table_name, the_key, one_line_description, datastore)
values
('static_pages', :new.page_id, :new.page_title, 'a');
END;
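The insert trigger alone is not enough; updates and deletes on
static_pages also have to be reflected. A minimal sketch of a matching
delete trigger, assuming only the site_wide_index columns shown above:
create or replace trigger static_pages_sws_delete_tr
after delete on static_pages for each row
BEGIN
    -- remove the corresponding row from the index table
    delete from site_wide_index
    where table_name = 'static_pages'
    and the_key = :old.page_id;
END;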
One rule of thumb for deciding what constitutes a small amount of work
is to count the number of rows in all tables that must be accessed to
update the corresponding row in the site-wide search table. If you
only need to access one or two rows, you should probably use this
method.
Approach 2: Mark the row for later update
The second approach relies on triggers to mark a row in the site-wide
search table as dirty. A scheduled process will later update the data
in the row. This is advantageous as it allows for batch processing of
dirty rows which may decrease the total time needed to update the
rows. By updating the index in batches, we can improve the overall
quality of the index, since interMedia will reduce index fragmentation
by grouping similar words together during the index process.
There are a couple of ways to store the dirty bit:
- Bad: Store a dirty_p column in the site-wide search table. Though
this is a simple solution, it forces a write to the site-wide search
table, decreasing overall performance. The key to maintaining top
search performance is to ensure that any book-keeping we do is kept
separate from the data associated with our site-wide index. Since the
dirty_p column is in the same tablespace and datafile as the rest of
the site-wide search table, updating this column forces an unnecessary
disk access to the same drive that holds your site-wide search table.
- Good: Create a table to store a list of rows that need to be
updated. One way to do this is to create the following table:
create table search_dirty_rows (
on_which_table varchar(50) not null,
on_what_id integer not null,
constraint sdr_which_table_what_id_pk
primary key(on_which_table, on_what_id),
creation_date date default sysdate
);
Note that you may consider removing the primary key
constraint to improve write performance on this table. We recommend
keeping the constraint because it leads to cleaner code for updating
the corresponding data in the search index: we do not need to worry
about updating the same data twice.
This solution also lends itself to a simple pipeline for keeping
the data current:
- Write triggers for each table on which the site-wide
search table depends. Example:
create or replace trigger vs_agency_tr
before insert or update on vs_agency
for each row
DECLARE
v_exists_p integer;
BEGIN
select decode(count(*),0,0,1) into v_exists_p
from search_dirty_rows
where on_which_table='vs_agency'
and on_what_id=:new.agency_id;
if v_exists_p = 0 then
BEGIN
-- We have to catch the case where another row is created
-- between the time we've counted and are going to insert
-- Note that the exception means the row already exists
-- so we don't have to do anything if we catch the exception
insert into search_dirty_rows
(on_which_table, on_what_id)
values
('vs_agency', :new.agency_id);
EXCEPTION WHEN OTHERS THEN NULL;
END;
end if;
END;
- Write one procedure that knows how to update the site-wide
search table for every table that has a trigger on it (this is the
same procedure used to initially populate the site-wide search table).
- Regularly update the dirty rows.
The method using search_dirty_rows leads to code that is
easier to understand and simpler to maintain. It is also easy to add
data from new tables to the site-wide search table, since the procedures
to keep the data up-to-date are independent.
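The last step of the pipeline, regularly updating the dirty rows,
boils down to a scheduled procedure that walks search_dirty_rows,
calls the appropriate per-table update procedure, and clears the rows
it has processed. A minimal sketch, where search_update_agency and
search_update_opp are hypothetical names standing in for whatever
procedures you wrote in the previous step:
create or replace procedure search_process_dirty_rows
as
begin
    for r in (select on_which_table, on_what_id
              from search_dirty_rows
              order by creation_date) loop
        -- dispatch to the per-table procedure that rebuilds the
        -- corresponding row(s) in the site-wide search table
        if r.on_which_table = 'vs_agency' then
            search_update_agency(r.on_what_id);
        elsif r.on_which_table = 'vs_opp' then
            search_update_opp(r.on_what_id);
        end if;
        -- the row is no longer dirty
        delete from search_dirty_rows
        where on_which_table = r.on_which_table
        and on_what_id = r.on_what_id;
    end loop;
    commit;
end;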
Step Two: Synchronize the interMedia indexes
Now that we have created a site-wide search table, we need to tell
interMedia what text to index. interMedia's datastore class is
responsible for translating any data into text that can be indexed
(the data itself can be stored as a blob, varchar2, or clob with clob
being the default). The datastore class is made up of the following
objects:
- Direct datastore - directly uses the text from one column in a table.
- User datastore - invokes a procedure for each row that you are
indexing. This procedure returns the text to index based on a
rowid.
- Detail datastore - combines multiple rows into the text to index.
- URL datastore - fetches the URL stored in the column and
indexes the returned text.
- File datastore - indexes the contents of the specified file.
The direct and user datastore objects are the most useful for
site-wide full-text searching. Using the direct datastore object may
require additional storage space as we store a duplicate copy of the
text to index in a column of our site-wide search table. The primary
advantage of the direct datastore is that we can simply query the
table to see what text is being indexed. With the user datastore
object, we have to execute the specified procedure to see what text is
being indexed.
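For comparison, a user datastore is set up by pointing interMedia at a
procedure that assembles the text for a given rowid. The sketch below
is illustrative only: sws_index_text is a hypothetical procedure (in
the 8.1.x releases it generally must live in the ctxsys schema), and
sws_user_ds is just a preference name we made up.
begin
    -- create a user datastore preference and tell it which
    -- procedure produces the text to index for each rowid
    ctx_ddl.create_preference('sws_user_ds', 'user_datastore');
    ctx_ddl.set_attribute('sws_user_ds', 'procedure', 'sws_index_text');
end;
/
-- then reference the preference when building the index
create index search_user_ds_idx on search(keywords)
    indextype is ctxsys.context
    parameters('datastore sws_user_ds');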
The example of the site-wide search table presented in this article
uses the direct datastore object by directly storing the text to index
in the keywords column of the search table.
To index the text in this column, we execute:
SQL> create index search_keywords_im_idx
on search(keywords)
indextype is ctxsys.context;
This command creates an interMedia index on search(keywords). Note
that the "im" in the name of the
index simply lets us quickly see that this is an interMedia index. The
amount of time it takes to create the index depends directly on the
hardware you are using and the amount of data that you are
indexing. On GuideStar, it takes about 2 1/2 hours to index
approximately 1 GB of data on a Sun E450 with 4 processors and 4 GB of
RAM. Note that in order to create interMedia indexes, the Oracle user
creating the index must have been granted the ctxapp role.
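For example, connected as a DBA (or as ctxsys), where vsuser is a
placeholder for the Oracle user that owns the search table:
SQL> grant ctxapp to vsuser;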
Most of the time, interMedia does a good job of creating
indexes. If you get errors during the build process, either you have
not properly installed interMedia or your index needs to be
dropped. Most of the errors we have seen have come from Net8
configuration errors (see http://oradoc.photo.net/ora816/network.816/a76933/preface.htm).
Performance with interMedia indexes varies widely, depending primarily
on how common the query terms are. For example, a rare word such as
"guidestar" will return quickly, whereas a common word such as
"philanthropy" will take significantly longer:
SQL> select count(*)
from search
where contains(keywords, 'guidestar', 1) > 0
order by score(1);
COUNT(*)
----------
5
Elapsed: 00:00:00.02
SQL> select count(*)
from search
where contains(keywords, 'philanthropy', 1) > 0
order by score(1);
COUNT(*)
----------
32517
Elapsed: 00:00:01.99
Once we have built the interMedia index, we need to keep it
synchronized with the data in the site-wide search table. Unlike
Oracle B-tree indexes, interMedia indexes are not updated during the
execution of SQL transactions. There are two ways to update the
interMedia indexes:
- Run Oracle Context Server (ctxsrv)
- Manually alter the indexes
Method One: Oracle Context Server
Oracle Context Server immediately updates an interMedia index after
any change to the underlying data. This process frees the user from
having to think about updating the interMedia indexes as the index
updates are transparent. To start context server, execute the
following from the Unix shell:
> ctxsrv -user ctxsys/ctxsyspwd -personality M -log $ORACLE_HOME/ctx/ctxsys.log &
where ctxsyspwd is the password for the Oracle user ctxsys.
There are two major problems with Oracle Context Server:
- Immediately updating interMedia indexes leads to greater
fragmentation and thus poorer performance.
- Oracle Context Server consistently crashed every few days when we ran it on GuideStar.
Method Two: Manually update the index
The second method is to manually update the index on a regular
basis. The simplest way to update an interMedia index is to open a
SQL*Plus session as the user who created the index. Then:
SQL> alter index search_keywords_im_idx rebuild online
parameters('sync memory 45M');
Index altered.
In the parameters string, sync tells interMedia to process the pending
updates for this index, and memory 45M sets the amount of memory the
synchronization is allowed to use.
"Manually" updating the index is the preferred method. The best way to
do this is to run a script every hour (or however often you want to
update your index) that executes the appropriate ALTER INDEX
statements. The script should run from a robust environment that is
relatively safe from interruptions, such as a Unix cron job or an
Oracle DBMS job.
We maintain the interMedia indexes on GuideStar and Volunteer
Solutions with ad-context-server.pl, a small Perl script that we run
as a cron job (see http://software.arsdigita.com/bin/ad-context-server.pl). This
script uses Perl DBI and DBD::Oracle to connect as the ctxsys user,
reads the list of indexes that need to be updated
(select PND_INDEX_NAME, count(*) from ctx_user_pending group by PND_INDEX_NAME;),
and executes the
alter index statements. Using this script, we have run both GuideStar
and Volunteer Solutions with no hitches for over half a year now.
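If you would rather avoid Perl, roughly the same loop can be written
in PL/SQL and run as the user who owns the indexes; this is a sketch
of the idea rather than the actual script (the 45M memory setting
simply mirrors the examples above):
begin
    -- sync every index that has pending rows
    for r in (select distinct pnd_index_name
              from ctx_user_pending) loop
        execute immediate 'alter index ' || r.pnd_index_name ||
            ' rebuild online parameters(''sync memory 45M'')';
    end loop;
end;
/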
As of Oracle 8.1.6, another safe way to update the index is to submit
a DBMS_JOB to issue the update statement. The webmail module in the
ACS uses this method:
-- Resync the interMedia index every hour.
DECLARE
v_job number;
BEGIN
dbms_job.submit(v_job,
'ctx_ddl.sync_index(''search_keywords_im_idx'');',
interval => 'sysdate + 1/24');
-- the job is not actually scheduled until the transaction commits
commit;
END;
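You can confirm that the job is scheduled, and remove it if you ever
switch back to the cron script (remember to use exactly one update
method), with standard DBMS_JOB housekeeping:
SQL> select job, what, next_date, interval from user_jobs;
SQL> exec dbms_job.remove(42);   -- 42 = the job number reported above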
Updating the index from within an environment that is constantly
restarted, e.g. from within a web server, is a bad idea. On
GuideStar, we initially created a simple procedure that ran every hour
from within the web server and simply executed the alter index
statement. One night we decided to explicitly update the index:
SQL> alter index search_keywords_im_idx rebuild online
parameters('sync memory 45M');
Two hours later, all keyword searches on the Web site were
failing. Users were getting back a page with zero results, regardless
of their query. We executed another alter index statement and got
back an error saying that the index was marked as FAILED/LOADING. Had
we been smart, we would have logged what was going on by executing, as
the ctxsys user:
SQL> exec ctx_output.start_log('better_days');
to log the progress of the alter index statement. This command would
have logged all errors and notices to $ORACLE_HOME/ctx/log/better_days.
You can also look at interMedia errors to find out what went wrong by
opening a SQL*Plus session as the ctxsys user and running
select err_timestamp, err_index_name, err_text from ctx_index_errors order by err_timestamp;
to get a list of all the errors on interMedia indexes, sorted by date.
The resume parameter to the alter index statement is designed to
finish updating an index:
SQL> alter index search_keywords_im_idx rebuild online
parameters('resume memory 45M');
The resume statement has never worked for us, but maybe you will have
better luck. More likely, you will just see the same error message you
saw before. At this point the index is corrupted, and the only fix we
have found is to drop and re-create it:
SQL> drop index search_keywords_im_idx force;
SQL> create index search_keywords_im_idx on search(keywords)
indextype is ctxsys.context;
And 2 1/2 hours later, the index was re-created and back
on-line. Meanwhile, our Web site was not serving search results for
over three hours.
This error occurred because the hourly procedure running inside our
web server to update the interMedia index started up in parallel to
the one we manually executed in SQL*Plus. These two threads competed
for some resource, and ended up in a deadlock. We have been able to
replicate this same error in two other ways:
- We had a scheduled procedure running in AOLserver updating the
index. We restarted the web server, and ended up with two concurrent alter
index statements running.
- We were running Oracle context server, and executed an alter
index statement from SQL*Plus.
Note that an uncommitted session in SQL*Plus can keep the alter index
statement from completing. Also, the probability of deadlock is
directly related to the amount of time it takes for your index to be
updated. If you update your index once an hour and the process only
takes a couple of minutes to finish, it is unlikely that you will ever
see a deadlock. On GuideStar, some index updates took a few hours,
making it easy to end up with a deadlock.
When manually altering interMedia indexes, you have to make sure that
no more than one alter index statement is running at any
given time. More than one statement could result in a deadlock inside
interMedia that can only be solved by dropping and recreating the
index.
The people at interMedia tell us that the deadlock should not happen,
and we agree! However, the reality of the deadlock gives us the
following interMedia law: Use exactly one method to update interMedia
indexes.
Known bugs
Oracle 8.1.5 has a bug in the parser for the about clause. The bug
seems to incorrectly map some multi-word queries to their
corresponding tokens in the interMedia tables. For example,
about('massachusetts bay'), which is tokenized as 'Massachusetts Bay',
is parsed as 'Massachusetts bay', not matching any row in the
interMedia tokens table and thereby throwing an error. If you find
yourself in this situation, you can remove the about clauses from your
queries.
We had many problems running interMedia in Oracle 8.1.5 and
8.1.5.1. On GuideStar, we had to restart the Oracle database every 1-2
nights just to maintain an adequate level of performance. We never
managed to fully track down the problem, but we were getting a lot of
loose processes that simply bogged down the database. We were also
constantly (twice a day at one point!) seeing ORA-600 errors:
00600, 00000, "internal error code, arguments: [%s], [%s], [%s], [%s], [%s], [%s], [%s], [%s]"
// *Cause: This is the generic internal error number for Oracle program
// exceptions. This indicates that a process has encountered an
// exceptional condition.
// *Action: Report as a bug - the first argument is the internal error number
All of these problems went away when we upgraded to Oracle 8.1.6. If
you are going to use interMedia, we do not recommend using any release
prior to Oracle 8.1.6. Oracle has also released patches for 8.1.6 that
address a few early problems, such as issues with theme-based queries.
These patches should be applied before using interMedia.
General Tuning Considerations
On the web, we are often interested in getting back the first few rows
that match a query as fast as possible (as opposed to generating a
list of all the matches). As with other Oracle SQL queries, interMedia
makes use of the FIRST_ROWS hint to optimize the query to return the
first rows in the shortest amount of time. See
http://technet.Oracle.com/products/intermedia/htdocs/text_techover.html#query_tuning
for more information about query optimization.
As you continue to update your interMedia indexes, the tables
interMedia maintains become more and more fragmented. You may
want to optimize your index by executing the following statement in
SQL*Plus (after removing ad-context-server.pl from your crontab - no
sense risking a deadlock!):
SQL> alter index search_keywords_im_idx rebuild online
parameters('optimize fast');
Optimization is an extremely slow process and it is not clear that the
performance gain of an optimized index will be noticeable. If you are
going to optimize, you should limit the amount of time that the
optimization will take to, for example, 180 minutes:
SQL> alter index search_keywords_im_idx rebuild online
parameters('optimize full maxtime 180');
The interMedia documentation (http://oradoc.photo.net/ora816/inter.816/a77061/ind10.htm)
has more information about optimizing indexes.
Another consideration is the layout of the data files used to store
the site-wide search table and the interMedia indexes. Ideally, you
will want to put your denormalized site-wide search table in its own
tablespace on its own mirrored drive and use another tablespace on
another mirrored drive to store the indexes. Separating the indexes
and the data will minimize I/O contention leading to better
performance.
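With interMedia, you control where the internal index tables are
stored through a storage preference. A hedged sketch, where
sws_index_ts is a placeholder for a tablespace on the separate
mirrored drive and sws_storage is just a preference name we made up:
begin
    -- send the interMedia internal tables and index to their own tablespace
    ctx_ddl.create_preference('sws_storage', 'basic_storage');
    ctx_ddl.set_attribute('sws_storage', 'i_table_clause', 'tablespace sws_index_ts');
    ctx_ddl.set_attribute('sws_storage', 'k_table_clause', 'tablespace sws_index_ts');
    ctx_ddl.set_attribute('sws_storage', 'r_table_clause', 'tablespace sws_index_ts');
    ctx_ddl.set_attribute('sws_storage', 'n_table_clause', 'tablespace sws_index_ts');
    ctx_ddl.set_attribute('sws_storage', 'i_index_clause', 'tablespace sws_index_ts');
end;
/
create index search_keywords_im_idx on search(keywords)
    indextype is ctxsys.context
    parameters('storage sws_storage');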
Text parsing - the interMedia query language
To use interMedia full-text search, you must learn interMedia's query
language. Here we present the most commonly used functions in terms of
full-text search on the web.
The basic syntax is to add contains and score clauses to your SQL
query. An example is the simplest way to learn this syntax:
SQL> select org_name, score(1) as the_score
from search
where contains(keywords, 'guidestar', 1) > 0
order by score(1) desc;
ORG_NAME THE_SCORE
-------------------------------------------------- ----------
Philanthropic Research, Inc. 100
Thankful Hearts Food Pantry, Inc. 17
Decatur County Education Foundation, Inc. 17
Wadessa Project Hope 17
Safe Haven Project, Inc. 17
In this example, we are looking for any row that matches the keyword
"guidestar" using the interMedia index on the keywords
column of the table search
. The third argument to the
contains
function, 1
, is simply a
placeholder that matches the argument to the score
function. You can specify multiple keywords in the query string by
separating each keyword with a comma. You can also specify the
relative weights of each of the keywords to affect how the overall
score is computed:
SQL> select org_name, score(1) as the_score
from search
where contains(keywords, 'guidestar*3,apotheosis', 1) > 0
order by score(1) desc;
ORG_NAME THE_SCORE
-------------------------------------------------- ----------
Philanthropic Research, Inc. 58
Thankful Hearts Food Pantry, Inc. 54
Decatur County Education Foundation, Inc. 54
Wadessa Project Hope 54
Safe Haven Project, Inc. 54
A Temple of the Apotheosis, Inc. 25
Indianapolis Art Center, Inc. 5
The interMedia query language supports three boolean operators: AND,
OR, and NOT (abbreviated as &, |, and ~ respectively). Note that the
NOT operator is equivalent to AND NOT.
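The operators combine with contains just like the single-keyword
examples above; for instance, to count opportunities that mention both
youth and soccer but not baseball (an illustrative query against the
search table):
SQL> select count(*)
from search
where contains(keywords, 'youth & soccer ~ baseball', 1) > 0;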
There are a few other useful interMedia operators worth mentioning:
- $ - This stemming operator matches all words similar to the one
specified. For example, the clause contains(keywords,'$child',1) > 0
matches all rows containing the word child or any related words such
as children:
SQL> select count(*)
from search
where contains(keywords,'child',1) > 0;
COUNT(*)
----------
198
SQL> select count(*)
from search
where contains(keywords,'$child',1) > 0;
COUNT(*)
----------
590
SQL> select count(*)
from search
where contains(keywords,'$children',1) > 0;
COUNT(*)
----------
590
- near - This operator looks for the proximity of the specified words
to each other. The closer the words appear to each other, the higher
the score. For example, if we wanted to find all rows matching youth
soccer:
SQL> select count(*)
from search
where contains(keywords,'youth,soccer',1) > 0;
COUNT(*)
----------
86
SQL> select count(*)
from search
where contains(keywords,'youth NEAR soccer',1) > 0;
COUNT(*)
----------
3
- about - This operator takes advantage of the theme of each document
to expand the number of results returned to the user. For example,
using the about operator on the same query as above returns an
additional eight rows:
SQL> select count(*)
from search
where contains(keywords,'about(youth,soccer)',1) > 0;
COUNT(*)
----------
94
Some of the folks at interMedia wrote a PL/SQL function to translate
what humans write to an interMedia query string consisting of the
NEAR and ABOUT operators (see the im_convert function in
http://software.arsdigita.com/www/doc/sql/site-wide-search.sql). This
function is fairly rough, but in many cases it will suffice.
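To give a feel for what such a translation produces (this is not the
actual im_convert output), a user query of "youth soccer" might be
expanded into an accumulation of the exact phrase, a NEAR clause, and
an about clause:
SQL> select count(*)
from search
where contains(keywords,
               'youth soccer, youth NEAR soccer, about(youth,soccer)',
               1) > 0;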
Summary
The overall process for implementing site-wide full-text search is as follows:
- Create and maintain one site-wide search table of all the data to index
- Create triggers on the underlying tables that simply mark the
corresponding row in the site-wide search table as dirty
- Regularly update the dirty rows in the site-wide search table
- Create the interMedia indexes you want to use
- Run ad-context-server.pl every hour as a Unix cron job or
submit an Oracle job to keep the interMedia indexes up-to-date. Do not
use any other method to update the index.
Future work
Currently, the most painful flaw in our use of interMedia is the
amount of work it takes to correctly translate a user's input to
interMedia jargon. We need to create a suite of text parsers that can
be used for different situations. The site-wide search module in
ACS continues to improve on the functions used to translate
user-entered text into interMedia query syntax, but there is still a
lot of room for improvement.