Search Queries

A query expression (referred to below as a search query) defines the criteria by which the search module of the Infopark Search Server performs a search. A query expression consists of search words, operators, and modifiers.


Operators and modifiers are keywords that are used in search expressions to define the relationship between search words. Only documents in which the search words have the specified relationship can show up in the search result. For example, the query expression

cherries <#AND> <#NOT> apricots

searches for documents that contain the search word cherries but not the search word apricots.

Keywords should always be enclosed in angle brackets. If the Search Engine is operated with a non-English locale, keywords must also be prefixed with a pound sign (#, also called a hash mark). The prefix is not required if the English locale is used.

The Search Engine does not distinguish between uppercase and lowercase letters in operator and modifier names.
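The keyword rules above can be illustrated with a small Python sketch. The helper names format_keyword and same_keyword are hypothetical and not part of the Search Engine; they only build and compare query strings.

```python
def format_keyword(name: str, english_locale: bool = False) -> str:
    """Wrap an operator or modifier name in angle brackets.

    The '#' prefix is added unless the English locale is assumed,
    because non-English locales require it.
    """
    prefix = "" if english_locale else "#"
    return f"<{prefix}{name}>"


def same_keyword(a: str, b: str) -> bool:
    """Compare two keywords, ignoring case, brackets, and the '#' prefix."""
    def normalize(keyword: str) -> str:
        return keyword.strip("<>").lstrip("#").lower()
    return normalize(a) == normalize(b)


# Build the example query from the text above.
query = f"cherries {format_keyword('AND')} {format_keyword('NOT')} apricots"
print(query)
```

Always emitting the prefixed form is the safer choice for generated queries, since the text states that the prefix is merely unnecessary, not forbidden, with the English locale.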

Parsers

The Search Engine is equipped with three parsers that analyze users' search queries and perform the corresponding search actions. The parsers differ in the way they analyze and interpret query expressions, and each was designed for a particular field of application. The parser to be used can be specified in XML search query requests to the Search Engine Server. As a result, query forms can offer a more or less complex query language, depending on the target user group.

A parser should not be equated with the query syntax it supports. For example, the simple parser accepts simple syntax as well as explicit syntax. Furthermore, explicit syntax allows you to use several notations.

Simple Parser

The simple parser is universal in that it accepts queries in simple as well as in explicit syntax. It is favoured in environments in which users want to get the best results with the least effort.

The simple parser converts simple queries to explicit queries, adding operators where it seems appropriate. Each search word is implicitly preceded by the MANY and the STEM operators.

  • MANY causes the document’s relevance to grow as the density of a word’s occurrence in the document grows. Density is a relative measure that specifies the relationship between the number of occurrences of a search word and the amount of text the document contains.
  • STEM causes the search to also include words that are variations of the search word.

If a user enters search words separated with commas, then the words are combined with the ACCRUE operator. This operator causes a document’s relevance to grow as the absolute number of occurrences of the search word grows. A query such as

apple, banana, orange

is therefore converted by the parser as follows:

<#accrue>(<#many><#stem>apple, <#many><#stem>banana, <#many><#stem>orange)
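The conversion shown above can be mimicked with a few lines of Python. This is only a sketch of the rewriting rule as described here, not the parser's actual implementation; the function name simple_to_explicit is hypothetical, and the handling of a single word without commas is an assumption.

```python
def simple_to_explicit(query: str) -> str:
    """Rewrite a comma-separated simple query in explicit syntax."""
    # Split on commas and drop empty entries such as a trailing comma.
    words = [word.strip() for word in query.split(",") if word.strip()]
    # Every search word is implicitly preceded by MANY and STEM.
    terms = [f"<#many><#stem>{word}" for word in words]
    if not terms:
        return ""
    if len(terms) == 1:
        # Assumption: a single word needs no ACCRUE wrapper.
        return terms[0]
    # Comma-separated words are combined with ACCRUE.
    return "<#accrue>(" + ", ".join(terms) + ")"


print(simple_to_explicit("apple, banana, orange"))
```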

Explicit Parser

The explicit parser is limited to processing search expressions written in explicit syntax. It was designed for environments in which search queries are generated under control of a software program. In such environments, instead of typing operators, users use checkboxes, for example, to select the operations to be applied to the search words that have been entered.

Search queries in explicit syntax can be made using prefix or infix notation:

  • Prefix notation uses parentheses to make an operator’s precedence explicit. For example, the expression <#OR> (a, <#AND> (b, c)) retrieves documents containing the search word a or a combination of b and c.
  • With infix notation the precedence of operators is implicit, i.e. associated with the operators themselves, unless the precedence is modified with parentheses. This applies mainly to the AND and OR operators, of which OR has the lower precedence. As a consequence, search words combined with AND are processed before those combined with OR. Example: a <#AND> b <#OR> c searches for documents containing both a and b, or containing c.
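To make the precedence rule concrete, the following Python sketch rewrites a flat infix query in prefix notation. It is an illustration under narrow assumptions: only the AND and OR operators are handled, parentheses are not supported, and the function name infix_to_prefix is hypothetical.

```python
import re


def split_on(text: str, operator: str) -> list:
    """Split a query at every occurrence of <operator> or <#operator>."""
    pattern = rf"\s*<#?{operator}>\s*"
    return re.split(pattern, text.strip(), flags=re.IGNORECASE)


def infix_to_prefix(query: str) -> str:
    """Rewrite a flat AND/OR infix query in prefix notation."""
    groups = []
    # OR has the lower precedence, so it forms the outermost split.
    for part in split_on(query, "OR"):
        operands = split_on(part, "AND")
        if len(operands) == 1:
            groups.append(operands[0])
        else:
            groups.append("<#AND>(" + ", ".join(operands) + ")")
    if len(groups) == 1:
        return groups[0]
    return "<#OR>(" + ", ".join(groups) + ")"


print(infix_to_prefix("a <#AND> b <#OR> c"))
```

Splitting on OR first and on AND second is what gives AND the tighter binding: each AND group is assembled completely before the groups are joined with OR.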

The following query is an example of prefix notation:

<#paragraph>("vehicle", <#sentence>("safety", <#phrase>("no", "compromise")))

Using infix notation, this query is stated as follows:

"vehicle" <#paragraph> "safety" <#sentence> "no" <#phrase> "compromise"

Literal Text

When you enclose individual words in double quotation marks, the Autonomy search engine interprets those words literally. For example, if you enter the word "film" in double quotation marks, the words "films", "filmed", and "filming" are not considered in the search:

"film"

The quotation marks are a syntactical element of the explicit syntax and can be used in the simple and the explicit parser. The following example retrieves documents that contain both the literal phrase "pharmaceutical companies" and the literal word "stock":

AND ("pharmaceutical companies", "stock")

The following example retrieves documents containing the phrase "black and white":

<PHRASE> (black "and" white)

The PHRASE operator does require angle brackets, and the "and" is enclosed in double quotation marks because it is to be interpreted as a literal word, not as an operator.
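When queries are generated by a program, words that coincide with operator names can be quoted automatically. The Python sketch below shows one way to do this; the set OPERATOR_NAMES lists only the operators mentioned in this section and is not a complete list, and the helper names protect and phrase are hypothetical.

```python
# Operator names mentioned in this section (assumed incomplete).
OPERATOR_NAMES = {
    "and", "or", "not", "accrue", "many", "stem",
    "phrase", "sentence", "paragraph", "freetext",
}


def protect(word: str) -> str:
    """Quote a word if it could be mistaken for an operator."""
    if word.lower() in OPERATOR_NAMES:
        return f'"{word}"'
    return word


def phrase(text: str) -> str:
    """Build a PHRASE query, quoting words that look like operators."""
    words = " ".join(protect(word) for word in text.split())
    return f"<PHRASE> ({words})"


print(phrase("black and white"))
```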

Additionally, when you enter a topic name enclosed in double quotation marks, the search engine will interpret the topic name as a literal word instead of a topic. This is useful if you want to search for a word that is the same as the name of a topic.

Free Text Parser

The free text parser allows you to phrase search queries as whole sentences or parts of sentences. It treats all text as a series of search words. As a consequence, operators are not recognized as such.

The free text parser converts a search query to a request in explicit syntax: it removes unimportant words such as articles, prepositions, and conjunctions from the query and combines the remaining search words into a sequence of words. This allows users to phrase queries as short questions.
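The word-removal step can be pictured with the following Python sketch. It is purely illustrative: the stop word list is a small assumed sample, the function name content_words is hypothetical, and the explicit-syntax request that the real parser builds from the remaining words is not reproduced here.

```python
import re

# A small assumed sample of articles, prepositions, and conjunctions.
STOP_WORDS = {
    "a", "an", "the",
    "of", "in", "on", "at", "to", "for", "with", "by", "from",
    "and", "or", "but",
}


def content_words(question: str) -> list:
    """Return the words of a question that are not stop words."""
    words = re.findall(r"\w+", question)
    return [word for word in words if word.lower() not in STOP_WORDS]


print(content_words("the safety of a vehicle in winter"))
```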

Using the FREETEXT operator, the free text functionality is also available in the simple and explicit parsers.