A query expression (also called a search query in the following) defines the criteria according to which the search module of the Infopark Search Server performs a search. A query expression consists of search words, operators, and modifiers.
Operators and modifiers are keywords that are used in search expressions to define the relationship between search words. Only documents in which the search words have the specified relationship can show up in the search result. For example, the query expression
cherries <#AND> <#NOT> apricots
searches for documents containing the search word cherries but not the search word apricots.
Keywords should always be enclosed in angle brackets. If the Search Engine is operated with a non-English locale, keywords must additionally be prefixed with a hash mark (#, also called a pound sign). The prefix is not necessary if the English locale is used.
The Search Engine does not distinguish between uppercase and lowercase letters in operator and modifier names.
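The keyword conventions above (angle brackets, optional hash prefix, case-insensitive names) can be illustrated with a small sketch. The function name normalize_keywords and the regular expression are assumptions made for this illustration; they are not part of the Search Engine.

```python
import re

# Assumption for illustration: a keyword is a run of letters in angle
# brackets, optionally prefixed with a hash mark, e.g. <and> or <#Not>.
KEYWORD = re.compile(r"<#?([A-Za-z]+)>")


def normalize_keywords(query: str) -> str:
    """Rewrite every bracketed keyword as <#NAME> with an uppercase name."""
    return KEYWORD.sub(lambda m: "<#" + m.group(1).upper() + ">", query)


print(normalize_keywords("cherries <and> <#Not> apricots"))
```

Writing keywords with the hash prefix in all cases keeps a generated query valid regardless of the locale the Search Engine runs with.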
The Search Engine is equipped with three so-called parsers which analyze the user’s search queries and perform the corresponding search actions. The parsers differ from each other in the way they analyze and interpret query expressions. Each of the three parsers was designed for a certain field of application. The parser to be used can be specified in XML search query requests to the Search Engine Server. Therefore, a more or less complex query language can be provided in query forms, depending on the target user group.
A parser should not be equated with the query syntax it supports. It is, for example, possible to use the simple syntax as well as explicit syntax in the simple parser. Furthermore, explicit syntax allows you to use several notations.
The simple parser is a universal one because it can be used for queries in simple as well as in explicit syntax. The parser is favoured in environments in which the users want to get the best results with the least effort.
The simple parser converts simple queries to explicit queries, adding operators where it seems appropriate. Each search word is implicitly preceded by MANY and STEM:
MANY causes the document’s relevance to grow as the density of a word’s occurrences in the document grows. Density is a relative measure that specifies the relationship between the number of occurrences of a search word and the amount of text the document contains.
STEM causes the search to also include words that are variations of the search word.
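To make the notion of density used by MANY more concrete, the following sketch computes a simple ratio of occurrences to total words. The formula and the function name density are assumptions for illustration only; the actual relevance calculation of the Search Engine is not documented here.

```python
def density(text: str, word: str) -> float:
    """Occurrences of `word` divided by the total number of words in `text`.

    Hypothetical measure for illustration, not the engine's real formula.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(word.lower()) / len(words)


short_doc = "apple pie with apple sauce"        # 2 occurrences in 5 words
long_doc = "apple " * 3 + "filler " * 27        # 3 occurrences in 30 words

print(density(short_doc, "apple"))
print(density(long_doc, "apple"))
```

Under this assumption, the longer document contains the word more often in absolute terms, but the shorter document has the higher density.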
If a user enters search words separated with commas, then the words are combined with the ACCRUE operator. This operator causes a document’s relevance to grow as the absolute number of occurrences of the search words grows. A query such as
apple, banana, orange
is therefore converted by the parser as follows:
<#accrue>(<#many><#stem>apple, <#many><#stem>banana, <#many><#stem>orange)
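The conversion described above can be sketched as a small function. The name expand_simple_query is hypothetical, and the sketch only covers comma-separated words; how the real simple parser handles other input is not described here.

```python
def expand_simple_query(query: str) -> str:
    """Mimic the described expansion of a comma-separated simple query."""
    # Split at commas and drop empty entries and surrounding whitespace.
    words = [w.strip() for w in query.split(",") if w.strip()]
    # Each search word is implicitly preceded by MANY and STEM.
    expanded = ["<#many><#stem>" + w for w in words]
    if not expanded:
        return ""
    if len(expanded) == 1:
        return expanded[0]
    # Comma-separated words are combined with the ACCRUE operator.
    return "<#accrue>(" + ", ".join(expanded) + ")"


print(expand_simple_query("apple, banana, orange"))
```

For the example query, the sketch produces the same explicit expression as shown above.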
The explicit parser is limited to processing search expressions written in explicit syntax. It was designed for environments in which search queries are generated under control of a software program. In such environments, instead of typing operators, users use checkboxes, for example, to select the operations to be applied to the search words that have been entered.
Search queries in explicit syntax can be made using prefix or infix notation:
<#OR> (a, <#AND> (b, c)) retrieves documents containing the search word a or a combination of b and c.
Infix notation takes into account the precedence of the AND and OR operators, of which OR has less precedence. As a consequence, search words combined with AND are processed before those combined with OR.
a <#AND> b <#OR> c searches for documents containing a as well as b, or documents containing c.
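The effect of operator precedence can be made visible by rewriting an infix query in prefix notation. The following is a minimal sketch that assumes whitespace-separated tokens, only the AND and OR operators, and no parentheses; the function names are hypothetical.

```python
def _split(tokens, operator):
    """Split a token list into groups at every occurrence of `operator`."""
    groups, current = [], []
    for token in tokens:
        if token.upper() == operator:
            groups.append(current)
            current = []
        else:
            current.append(token)
    groups.append(current)
    return groups


def _combine(operator, operands):
    """Render operands in prefix notation; a single operand stays as is."""
    if len(operands) == 1:
        return operands[0]
    return operator + " (" + ", ".join(operands) + ")"


def infix_to_prefix(query: str) -> str:
    """Split at OR first (lowest precedence), then at AND within each part."""
    or_operands = []
    for group in _split(query.split(), "<#OR>"):
        and_operands = [" ".join(part) for part in _split(group, "<#AND>")]
        or_operands.append(_combine("<#AND>", and_operands))
    return _combine("<#OR>", or_operands)


print(infix_to_prefix("a <#AND> b <#OR> c"))
```

Because the query is split at OR before AND, the AND groups end up nested inside the OR expression, which mirrors AND being processed first.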
The following query is an example of prefix notation:
<#paragraph>("vehicle", <#sentence>("safety", <#phrase>("no", "compromise")))
Using infix notation, this query is stated as follows:
"vehicle" <#paragraph> "safety" <#sentence> "no" <#phrase> "compromise"
When you enclose individual words in double quotation marks, the Autonomy search engine interprets those words literally. For example, if you enter the word "film" in double quotation marks, the words "films", "filmed", and "filming" are not considered in the search.
The quotation marks are a syntactical element of the explicit syntax and can be used in the simple and the explicit parser. The following example retrieves documents that contain both the literal phrase "pharmaceutical companies" and the literal word "stock":
AND ("pharmaceutical companies", "stock")
The following example retrieves documents containing the phrase "black and white":
<PHRASE> (black "and" white)
The PHRASE operator requires angle brackets, and the word "and" is enclosed in double quotation marks because it is to be interpreted as a literal word, not as an operator.
Additionally, when you enter a topic name enclosed in double quotation marks, the search engine will interpret the topic name as a literal word instead of a topic. This is useful if you want to search for a word that is the same as the name of a topic.
The free text parser allows you to make search queries that take the form of sentences or parts of sentences. It treats all text as a series of search words. As a consequence, operators are not recognized.
From a search query, the free text parser generates a request in explicit syntax by removing unimportant words such as articles, prepositions, and conjunctions and combining the remaining search words into a sequence of words. The parser thus allows users to make queries in the form of short questions.
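The word filtering described above can be sketched as follows. The stop word list and the function name free_text_words are assumptions for illustration; the words the free text parser actually removes, and the explicit expression it builds from the remaining words, are not specified here.

```python
import re

# Hypothetical stop word list: a few articles, prepositions, and conjunctions.
STOP_WORDS = {
    "a", "an", "the",                      # articles
    "of", "in", "on", "at", "for", "to",   # prepositions
    "and", "or", "but",                    # conjunctions
}


def free_text_words(text: str) -> list[str]:
    """Return the search words of `text` with the stop words removed."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


print(free_text_words("Safety of the vehicle and the driver"))
```

Note that words such as "and" or "or" are treated as plain text in this sketch, in line with the statement that the free text parser does not recognize operators.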
With the FREETEXT operator, the free text functionality is also available in the simple and explicit parsers.