A query expression (referred to below as a search query) defines the criteria according to which the search module of the Search Server performs a search. A query expression consists of search words, operators, and modifiers.
Operators and modifiers are keywords that are used in search expressions to define the relationship between search words. Only documents in which the search words have the specified relationship can show up in the search result. For example, the query expression
cherries <#AND> <#NOT> apricots
searches for documents that contain the search word cherries but not apricots.
Keywords should always be enclosed in angle brackets. If the Search Engine is operated with a non-English localisation, keywords must always be prefixed with a pound sign (#, also called a hash mark). This is not necessary if the English locale is used.
The Search Engine does not distinguish between uppercase and lowercase letters in operator and modifier names.
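The keyword rules above (angle brackets, optional pound sign, case-insensitive names) can be illustrated with a minimal Python sketch. The helper names `keyword` and `keyword_name` are hypothetical and not part of the Search Engine. The sketch assumes that the prefixed form is also accepted with the English locale, since the text only says the prefix is not necessary there.

```python
import re

# Matches a keyword such as <AND>, <#and> or <#Stem>; the '#' prefix is optional.
_KEYWORD = re.compile(r"<#?([A-Za-z]+)>")


def keyword(name: str) -> str:
    """Render a keyword name with angle brackets and the '#' prefix (hypothetical helper)."""
    return f"<#{name.upper()}>"


def keyword_name(token: str) -> str:
    """Return the keyword name of a token, ignoring letter case and the '#' prefix."""
    match = _KEYWORD.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a keyword: {token!r}")
    return match.group(1).upper()


print(keyword("and"))                                    # <#AND>
print(keyword_name("<#and>") == keyword_name("<AND>"))   # True
```

Always emitting the prefixed form keeps generated queries independent of the configured locale, under the assumption stated above.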
The Search Engine is equipped with three so-called parsers which analyze the user’s search queries and perform the corresponding search actions. The parsers differ from each other in the way they analyze and interpret query expressions. Each of the three parsers was designed for a certain field of application. The parser to be used can be specified in XML search query requests to the Search Engine Server. Therefore, a more or less complex query language can be provided in query forms, depending on the target user group.
A parser should not be equated with the query syntax it supports. It is, for example, possible to use the simple syntax as well as explicit syntax in the simple parser. Furthermore, explicit syntax allows you to use several notations.
The simple parser is a universal one because it can be used for queries in simple as well as in explicit syntax. The parser is favoured in environments in which the users want to get the best results with the least effort.
The simple parser converts simple queries to explicit queries, adding operators where it seems appropriate. Each search word is implicitly preceded by the MANY and STEM operators.
MANY causes the document’s relevance to grow as the density of a word’s occurrence in the document grows. Density is a relative measure that specifies the relationship between the number of occurrences of a search word and the amount of text the document contains.
STEM causes the search to also include words that are variations of the search word.
If a user enters search words separated by commas, the words are combined with the ACCRUE operator. This operator causes a document’s relevance to grow as the absolute number of occurrences of the search word grows. A query such as
apple, banana, orange
is therefore converted by the parser as follows:
<#accrue>(<#many><#stem>apple, <#many><#stem>banana, <#many><#stem>orange)
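The conversion shown above can be approximated in a few lines of Python. This is only a sketch of the documented behaviour for comma-separated words, not the parser's actual implementation. The function name `expand_simple_query` is hypothetical, and returning a single word without ACCRUE is an assumption.

```python
def expand_simple_query(query: str) -> str:
    """Mimic how the simple parser expands comma-separated search words."""
    words = [word.strip() for word in query.split(",") if word.strip()]
    if not words:
        raise ValueError("the query contains no search words")
    # Each search word is implicitly preceded by MANY and STEM.
    expanded = [f"<#many><#stem>{word}" for word in words]
    if len(expanded) == 1:
        # Assumption: a single word needs no ACCRUE wrapper.
        return expanded[0]
    # Comma-separated words are combined with ACCRUE.
    return "<#accrue>(" + ", ".join(expanded) + ")"


print(expand_simple_query("apple, banana, orange"))
# <#accrue>(<#many><#stem>apple, <#many><#stem>banana, <#many><#stem>orange)
```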
The explicit parser is limited to processing search expressions written in explicit syntax. It was designed for environments in which search queries are generated under control of a software program. In such environments, instead of typing operators, users use checkboxes, for example, to select the operations to be applied to the search words that have been entered.
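As an illustration of program-generated queries, the following Python sketch builds an explicit query from form input. The function `build_explicit_query` and its parameters are hypothetical. They stand in for whatever the query form collects, for example an operator chosen with a checkbox and a flag for stemming.

```python
def build_explicit_query(operator: str, words: list[str], stem: bool = False) -> str:
    """Combine the entered search words with the operator selected in a form."""
    if not words:
        raise ValueError("at least one search word is required")
    # Optionally precede every search word with the STEM operator.
    prefix = "<#STEM>" if stem else ""
    operands = ", ".join(f"{prefix}{word}" for word in words)
    return f"<#{operator.upper()}>({operands})"


print(build_explicit_query("and", ["cherries", "apricots"], stem=True))
# <#AND>(<#STEM>cherries, <#STEM>apricots)
```

Because the program writes the operators, users never have to type keywords or know the syntax.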
Search queries in explicit syntax can be written in prefix or infix notation. In prefix notation, the operator precedes its operands. For example,
<#OR> (a, <#AND> (b, c))
retrieves documents containing the search word a or a combination of b and c.
In infix notation, the operator is placed between its operands. Precedence applies to the AND and OR operators, of which OR has the lower precedence. As a consequence, search words combined with AND are processed before those combined with OR. Example: a <#AND> b <#OR> c searches for documents containing a as well as b, or c.
The following query is an example of prefix notation:
<#paragraph>("vehicle", <#sentence>("safety", <#phrase>("no", "compromise")))
Using infix notation, this query is stated as follows:
"vehicle" <#paragraph> "safety" <#sentence> "no" <#phrase> "compromise"
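The precedence of AND over OR in infix notation can be made visible by rewriting an infix query in prefix notation. The following Python sketch does this for queries that consist only of single search words, <#AND>, and <#OR>. The function names are hypothetical, and the sketch does not reflect how the Search Engine parses queries internally.

```python
def _group(operator: str, operands: list[str]) -> str:
    """Wrap several operands in a prefix operator; a single operand stays as it is."""
    if len(operands) == 1:
        return operands[0]
    return f"<#{operator}>({', '.join(operands)})"


def infix_to_prefix(query: str) -> str:
    """Rewrite an infix AND/OR query in prefix notation, with AND binding tighter."""
    or_operands: list[str] = []
    and_operands: list[str] = []
    expect_word = True
    for token in query.split():
        upper = token.upper()
        if upper in ("<#AND>", "<#OR>"):
            if expect_word:
                raise ValueError(f"unexpected operator {token}")
            if upper == "<#OR>":
                # OR has the lower precedence: close the current AND group.
                or_operands.append(_group("AND", and_operands))
                and_operands = []
            expect_word = True
        else:
            if not expect_word:
                raise ValueError(f"operator expected before {token!r}")
            and_operands.append(token)
            expect_word = False
    if expect_word:
        raise ValueError("the query must end with a search word")
    or_operands.append(_group("AND", and_operands))
    return _group("OR", or_operands)


print(infix_to_prefix("a <#AND> b <#OR> c"))  # <#OR>(<#AND>(a, b), c)
print(infix_to_prefix("a <#OR> b <#AND> c"))  # <#OR>(a, <#AND>(b, c))
```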
Literal Text
When you enclose individual words in double quotation marks, the Autonomy search engine interprets those words literally. For example, if you enter the word "film" in double quotation marks, the words "films", "filmed", and "filming" are not considered in the search:
"film"
The quotation marks are a syntactical element of the explicit syntax and can be used in the simple and the explicit parser. The following example retrieves documents that contain both the literal phrase "pharmaceutical companies" and the literal word "stock":
AND ("pharmaceutical companies", "stock")
The following example retrieves documents containing the phrase "black and white":
<PHRASE> (black "and" white)
The PHRASE operator does require angle brackets, and the "and" is enclosed in double quotation marks because it is to be interpreted as a literal word, not as an operator.
Additionally, when you enter a topic name enclosed in double quotation marks, the search engine will interpret the topic name as a literal word instead of a topic. This is useful if you want to search for a word that is the same as the name of a topic.
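The quoting rule from the PHRASE example can be sketched in Python as follows. The set `OPERATOR_NAMES` contains only the operators mentioned in this section and is not a complete list. The function name `phrase_query` is hypothetical, and topic names are not handled.

```python
# Operator names mentioned in this section (assumed to be incomplete).
OPERATOR_NAMES = {
    "AND", "OR", "NOT", "ACCRUE", "MANY", "STEM",
    "PHRASE", "SENTENCE", "PARAGRAPH", "FREETEXT",
}


def phrase_query(text: str) -> str:
    """Build a PHRASE query, quoting words that would be read as operators."""
    words = []
    for word in text.split():
        if word.upper() in OPERATOR_NAMES:
            # Double quotation marks make the engine treat the word literally.
            words.append(f'"{word}"')
        else:
            words.append(word)
    return "<PHRASE> (" + " ".join(words) + ")"


print(phrase_query("black and white"))  # <PHRASE> (black "and" white)
```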
The free text parser allows you to make search queries that take the form of sentences or parts of sentences. It treats all text as a series of search words. As a consequence, operators are not recognized.
From a search query, the free text parser generates a request in explicit syntax: it removes unimportant words such as articles, prepositions, and conjunctions and combines the remaining search words into a sequence of words. This allows users to make queries in the form of short questions.
Using the FREETEXT operator, the free text functionality is also available in the simple and explicit parsers.
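The word-filtering step of the free text parser can be pictured with a short Python sketch. The stop word list below is a small illustrative assumption, not the list the Search Engine uses. The function name `free_text_words` is hypothetical, and the sketch stops at the sequence of remaining words without guessing how the parser renders them in explicit syntax.

```python
import re

# Illustrative stop words: a few articles, prepositions, and conjunctions (assumed).
STOP_WORDS = {
    "a", "an", "the",
    "of", "in", "on", "at", "to", "for", "with", "by", "from",
    "and", "or", "but",
}


def free_text_words(query: str) -> list[str]:
    """Return the search words of a free text query with stop words removed."""
    # All text is treated as search words; punctuation is dropped.
    words = re.findall(r"[A-Za-z0-9']+", query)
    return [word for word in words if word.lower() not in STOP_WORDS]


print(free_text_words("Which cars are built for safety in the city?"))
# ['Which', 'cars', 'are', 'built', 'safety', 'city']
```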