Version 1.1, 12th January 2004
While the searchRetrieve operation enables searches for a specific term within an index, the scan operation allows the client to request a slice of the available terms at a given point within the list of terms in the index. This enables clients to present an ordered list of values and, generally, the number of hits that a search on each term would return. Scan is typically used to select terms for subsequent searching or to visually verify a negative search result. It can also be used to detect data errors for correction.
The scan operation has the version, stylesheet and extraRequestData fields which are common to all requests in SRW. These are documented elsewhere and maintain the same semantics in the scan operation. The following parameters are unique to scan.
The index to be browsed and the start point within it are given in the scanClause parameter as a complete index, relation, term clause in CQL. The relation should always be '=', as relations other than equality are meaningless in the scan operation. Relation modifiers, on the other hand, may be given as for a search, including such modifiers as 'stem', 'string', 'word' or any other modifier which would affect the format of the terms to be returned. The term given in the clause is the position within the ordered list of terms at which to start; see the responsePosition parameter below for how this position is interpreted. If the empty term is given, it may be interpreted as the beginning of the term list, even if searching for the empty term is unsupported by the server.
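As an illustration of the clause structure described above, the following is a minimal sketch of how a client might assemble a scanClause string. The helper names `quote_cql_term` and `build_scan_clause` and the index name `dc.title` are assumptions made for this example, not part of the specification; the `=/modifier` form follows the CQL convention of attaching relation modifiers after a slash.

```python
from typing import Sequence


def quote_cql_term(term: str) -> str:
    """Wrap a term in double quotes, escaping backslashes and embedded quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_scan_clause(index: str, term: str, modifiers: Sequence[str] = ()) -> str:
    """Build an 'index relation term' clause for scan (hypothetical helper).

    The relation is always '=', since other relations are meaningless in scan.
    Relation modifiers such as 'stem' or 'word' are appended as '/modifier'.
    """
    relation = "=" + "".join(f"/{m}" for m in modifiers)
    return f"{index} {relation} {quote_cql_term(term)}"


if __name__ == "__main__":
    # 'dc.title' is only an example index name.
    print(build_scan_clause("dc.title", "frog"))
    print(build_scan_clause("dc.title", "frog", ["stem"]))
    # The empty term, which a server may treat as the start of the term list.
    print(build_scan_clause("dc.title", ""))
```

The term is always quoted here so that the empty term and terms containing spaces need no special handling.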
The maximumTerms parameter is the number of terms which the client requests be returned. The actual number returned may be fewer than this, for example if the end of the term list is reached, but may not be more. The explain record for the database may include the maximum number of terms which the server will return at once. A request for more than this number will generate a diagnostic.
The responsePosition parameter is the position within the list of terms returned where the client would like the start term to occur. If the position given is 0, then the start term should be immediately before the first term in the response. If the position given is 1, then the start term should be first in the list, and so forth, up to the number of terms requested plus 1, which means that the start term should be immediately after the last term in the response, even if the number of terms returned is less than the number requested.
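The interaction between the requested number of terms and the requested position can be sketched as a window over a sorted term list. This is only one possible reading, written under several assumptions: the index is an in-memory sorted list, the function name `scan_slice` is hypothetical, a start term that is absent from the index is replaced by its insertion point, and the position defaults to 1.

```python
from bisect import bisect_left
from typing import List, Sequence


def scan_slice(
    terms: Sequence[str],
    start_term: str,
    maximum_terms: int,
    response_position: int = 1,  # assumed default for this sketch
) -> List[str]:
    """Return the window of terms a server might send back (illustrative only).

    `terms` must already be sorted. Position 1 puts the start term first,
    position 0 puts it just before the window, and maximum_terms + 1 puts it
    just after the window.
    """
    if maximum_terms < 0:
        raise ValueError("maximum_terms must not be negative")
    if not 0 <= response_position <= maximum_terms + 1:
        raise ValueError("response_position must be in 0..maximum_terms + 1")

    # Index of the start term, or of the place it would be inserted.
    anchor = bisect_left(terms, start_term)

    # Shift the window so the start term lands at the requested position.
    first = anchor - (response_position - 1)
    last = first + maximum_terms

    # Clamping at the start of the list may yield fewer terms than requested.
    return list(terms[max(first, 0):max(last, 0)])


if __name__ == "__main__":
    index_terms = ["apple", "banana", "cherry", "date", "elder", "fig"]
    for position in (0, 1, 2, 4):
        print(position, scan_slice(index_terms, "cherry", 3, position))
```

With position 4 and three terms requested, the window would begin before the start of the list, so fewer terms come back while the start term still falls immediately after the last one returned.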
The scan response includes the version, diagnostics and extraResponseData fields which are documented elsewhere, but maintain the same semantics as for the other operations in SRW.
The terms parameter contains a list of term elements, each representing a single term within the index. A term has the following fields:
| Name | Type | Required | Description |
|---|---|---|---|
| value | xsd:string | Mandatory | The term, exactly as it appears in the index. |
| numberOfRecords | xsd:integer | Optional | The number of records which would be matched if the index was searched with the term. |
| displayTerm | xsd:string | Optional | A string to display to the end user in place of the term itself. For example, this might add back in stopwords which do not appear in the index. |
| extraTermData | xml | Optional | Additional profile-specific information concerning the term. |
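The term fields in the table can be read from a response with a standard XML parser. The sketch below is illustrative: the namespace URI, the `ScanTerm` class and the `parse_terms` function are assumptions made for this example, and extraTermData is ignored because its content is profile specific.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Assumed namespace for the response elements; adjust to the server in use.
SRW_NS = {"srw": "http://www.loc.gov/zing/srw/"}


@dataclass
class ScanTerm:
    value: str                               # mandatory
    number_of_records: Optional[int] = None  # optional
    display_term: Optional[str] = None       # optional


def parse_terms(xml_text: str) -> List[ScanTerm]:
    """Extract the term elements from a scan response (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    result = []
    for term in root.iterfind(".//srw:terms/srw:term", SRW_NS):
        value = term.findtext("srw:value", namespaces=SRW_NS)
        if value is None:
            raise ValueError("term element is missing its mandatory value")
        count = term.findtext("srw:numberOfRecords", namespaces=SRW_NS)
        result.append(
            ScanTerm(
                value=value,
                number_of_records=int(count) if count is not None else None,
                display_term=term.findtext("srw:displayTerm", namespaces=SRW_NS),
            )
        )
    return result
```

A client would typically show `display_term` when it is present and fall back to `value` otherwise, while always using `value` for any subsequent search.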
The echoedScanRequest parameter contains the request which the client sent. It is recommended for use with SRU, but may also be returned in SRW. It is useful when constructing a thin client which does not maintain its own internal state, so that it can reconstruct the query.
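A stateless client could use the echoed values to rebuild the request URL it originally sent. The following sketch assumes the echoed fields have already been extracted into a plain mapping keyed by parameter name; the function name `rebuild_scan_url` and the base URL are hypothetical.

```python
from typing import Mapping
from urllib.parse import urlencode


def rebuild_scan_url(base_url: str, echoed: Mapping[str, str]) -> str:
    """Recreate an SRU scan URL from echoed request values (illustrative only)."""
    params = {"operation": "scan"}
    # Copy only the scan parameters that were actually echoed back.
    for name in ("version", "scanClause", "responsePosition", "maximumTerms"):
        if name in echoed:
            params[name] = str(echoed[name])
    # urlencode percent-encodes the CQL clause (spaces, quotes, '=').
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    echoed_request = {
        "version": "1.1",
        "scanClause": 'dc.title = "frog"',
        "maximumTerms": "10",
        "responsePosition": "1",
    }
    # The host below is a placeholder, not a real service.
    print(rebuild_scan_url("http://example.org/sru", echoed_request))
```

Because every value needed for the request comes from the response itself, the client needs no session storage between requests.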