An indexing request is coded using an ses-indexDoc element inside an ses-request. This is shown in the following example. It also shows that several requests can be placed into one payload:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ses-payload SYSTEM "http://www.example.com/ses.dtd">
<ses-payload payload-id="B42TE241" timestamp="20100825172100" version="2.1">
  <ses-header>
    <ses-sender sender-id="FX45RTDT" name="CM-Server"/>
    <ses-authentication login="cm-server" password=""/>
  </ses-header>
  <ses-request request-id="BR12TI5X">
    <ses-indexDoc docId="4712" collection="collection1" mimeType="application/ms-word" usesStreaming="YES">
      <title encoding="plain">testdoc1</title>
      <customAttribute encoding="base64">MGHX2c5=</customAttribute>
      <blob encoding="stream">d--1157180779-000000001-X</blob>
    </ses-indexDoc>
  </ses-request>
  <ses-request request-id="BR12TI5Y">
    <ses-indexDoc docId="4713" collection="collection2">
      <title encoding="plain">testdoc2</title>
      ...
      <blob>Just the blob</blob>
    </ses-indexDoc>
  </ses-request>
</ses-payload>
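A payload like the one above could be assembled programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper build_payload and its argument layout are hypothetical and not part of the Search Engine Server, and the sender and login values are simply copied from the example. The DOCTYPE declaration is not generated here.

```python
import xml.etree.ElementTree as ET


def build_payload(payload_id, timestamp, requests):
    """Build an ses-payload element holding one ses-request per entry.

    `requests` is a list of (request_id, doc_attributes, subelements) tuples;
    `subelements` is a list of (tag, encoding_or_None, text) tuples.
    This helper is illustrative only.
    """
    payload = ET.Element(
        "ses-payload",
        {"payload-id": payload_id, "timestamp": timestamp, "version": "2.1"},
    )
    # Header values taken from the example payload above.
    header = ET.SubElement(payload, "ses-header")
    ET.SubElement(header, "ses-sender", {"sender-id": "FX45RTDT", "name": "CM-Server"})
    ET.SubElement(header, "ses-authentication", {"login": "cm-server", "password": ""})

    for request_id, doc_attributes, subelements in requests:
        request = ET.SubElement(payload, "ses-request", {"request-id": request_id})
        index_doc = ET.SubElement(request, "ses-indexDoc", doc_attributes)
        for tag, encoding, text in subelements:
            # Only emit the encoding attribute when one was given.
            attrs = {"encoding": encoding} if encoding else {}
            ET.SubElement(index_doc, tag, attrs).text = text
    return payload


payload = build_payload(
    "B42TE241",
    "20100825172100",
    [
        ("BR12TI5X",
         {"docId": "4712", "collection": "collection1",
          "mimeType": "application/ms-word", "usesStreaming": "YES"},
         [("title", "plain", "testdoc1"),
          ("blob", "stream", "d--1157180779-000000001-X")]),
        ("BR12TI5Y",
         {"docId": "4713", "collection": "collection2"},
         [("title", "plain", "testdoc2"), ("blob", None, "Just the blob")]),
    ],
)
xml_text = ET.tostring(payload, encoding="unicode")
```

Each entry in the list becomes its own ses-request, which mirrors how the example places two requests into one payload.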
The ses-indexDoc element has the following attributes:
docId
collection
mimeType
usesStreaming (YES or NO)
The ses-indexDoc element contains as subelements all the attributes listed in section Content Indexing. Of these attributes, only title, the custom attribute customAttribute, and blob were used in the example above. The encoding of the contents of the object and content attributes to be indexed is specified using the encoding attribute of the attribute tags concerned. encoding can have one of the following values:
plain
base64
stream
If an attribute is base64-encoded or has been transferred to the Search Engine Server via the streaming interface, a preprocessor must have been configured for the MIME type of the document. This preprocessor's task is to convert the attribute's content to plain text and to set the value of encoding to plain.
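The choice between the plain and base64 values can be illustrated with a small client-side sketch. The helper encode_attribute is hypothetical; it assumes that text is sent as plain and that raw bytes are base64-encoded using Python's standard base64 module.

```python
import base64


def encode_attribute(value):
    """Return (encoding, text) for an attribute subelement.

    Text is passed through as "plain"; raw bytes are base64-encoded
    so that they can be embedded in the XML request.
    """
    if isinstance(value, bytes):
        return "base64", base64.b64encode(value).decode("ascii")
    return "plain", value


title_encoding, title_text = encode_attribute("testdoc1")
blob_encoding, blob_text = encode_attribute(b"\x00\x01binary data")
```

The returned pair maps directly onto the encoding attribute and the element content of the respective tag.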
A client can send the contents of attributes to the Search Engine Server in advance, i.e. before sending it an indexing request. This procedure is recommended for large amounts of binary data because it is faster than base64-encoding the data and including it in the request.
A client uses the so-called streaming interface to transfer such data to the Search Engine Server. The streaming interface is addressed by sending a POST request to the HTTP port of the Search Engine Server, specifying /stream as the URL. After the data has been transferred, the client receives a streaming ticket in the response. In the indexing request that follows, the client refers to the data by specifying the ticket ID in the manner shown in the example above, i.e. as the content of a tag whose encoding is stream.
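The upload step might look as follows from a Python client, using the standard urllib.request module. This is a sketch under assumptions: the host name, the port 3011, and the Content-Type header are placeholders, and because the format of the response is not described here, the hypothetical upload function simply returns the raw response body that carries the ticket.

```python
import urllib.request


def build_stream_request(host, port, data):
    """Prepare a POST request to the /stream URL of the Search Engine Server."""
    url = f"http://{host}:{port}/stream"
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        # Assumed header; the required content type is not specified in this section.
        headers={"Content-Type": "application/octet-stream"},
    )


def upload(host, port, data):
    """Send the data and return the raw response body containing the ticket."""
    with urllib.request.urlopen(build_stream_request(host, port, data)) as response:
        return response.read()


# Building the request does not open a network connection.
request = build_stream_request("localhost", 3011, b"binary document contents")
```

Separating request construction from sending makes it possible to inspect the URL and method without a running server.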
The Content Management Server transfers the contents of generic documents to the Search Engine Server via the streaming interface. This also applies to the body of publication, document, and template objects if the body is larger than 8 kilobytes. The same is true for the Template Engine, except that it does not send templates to the Search Engine Server for indexing. The minimum amount of data to be transferred via streaming can be configured in the system configuration of the Content Manager and the Template Engine using the minStreamingDataLength entry.
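The size rule described above can be sketched as a simple threshold check. The function should_stream is hypothetical, and the sketch assumes that the threshold is measured in bytes, that 8 kilobytes means 8192 bytes, and that only bodies strictly larger than the threshold are streamed; the actual unit and comparison used with minStreamingDataLength may differ.

```python
# Assumption: 8 kilobytes interpreted as 8192 bytes.
MIN_STREAMING_DATA_LENGTH = 8 * 1024


def should_stream(body, min_length=MIN_STREAMING_DATA_LENGTH):
    """Return True if the body is larger than the streaming threshold."""
    return len(body) > min_length


small_body = b"x" * 100
large_body = b"x" * 10000
decisions = (should_stream(small_body), should_stream(large_body))
```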