Record XPaths

Version 1.1, 12th January 2004

Introduction

SRW can be used to retrieve any sort of XML record, and is not limited by size or complexity in any way. Records in schemas such as TEI, EAD, SVG, X3D or OpenOffice's schema can be extremely long and complex.

Clients may wish to request only the specific sections of a record that are needed for display, rather than the entire record, for example to create a title list display. Even simple Dublin Core records may be considerably longer than the client requires, and requesting an entire record only to discard most of it wastes resources. Equally, a client may want to page through a single record which is too long to display easily, but lack the capability to do this segmentation itself. Another usage scenario is a client that has some prior knowledge of the records and is only interested in certain sections, for example the trumpet section of an XML-encoded musical score.

To enable a client to request only the parts of a record that it is immediately interested in, the client may supply an XPath expression to be evaluated.

Semantics

The recordXPath field in the searchRetrieveRequest has two sections, but is modelled as a single string for simplicity. The first is a mandatory XPath expression. The second is an optional identifier for a record schema; if it is not supplied, the server will supply a default. The recordSchema field of the request should contain the identifier for a version of the XPath result schema in which the results should be encoded.

The XPath expression is considered, unless otherwise namespaced, to be relative to the record schema. For example, if simple Dublin Core is given as the record schema, then the path '/dc/title' is valid, but '/ead/eadheader/titlestmt/titleproper' is not, even if the record could be returned in EAD.
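
As an informal illustration of a path being evaluated against the record as rendered in the requested schema, the following Python sketch uses the standard library's xml.etree.ElementTree module. The sample record, the evaluate_path name and the synthetic "document" wrapper element are assumptions made for this example, and ElementTree supports only a small subset of XPath. In this sketch the EAD path simply matches nothing; a server could equally reject it with a diagnostic.

```python
import xml.etree.ElementTree as ET

# Hypothetical record, already rendered in a simple Dublin Core-like schema.
DC_RECORD = "<dc><title>Hamlet</title><creator>Shakespeare</creator></dc>"


def evaluate_path(record_xml, path):
    """Evaluate a simple absolute path against one record.

    ElementTree refuses absolute paths on an element, so the record is
    placed under a synthetic wrapper and the path is made relative to it.
    """
    wrapper = ET.Element("document")  # hypothetical stand-in for the document node
    wrapper.append(ET.fromstring(record_xml))
    return wrapper.findall("." + path)


# The Dublin Core path matches; the EAD path does not apply to this rendering.
print([node.text for node in evaluate_path(DC_RECORD, "/dc/title")])
print(evaluate_path(DC_RECORD, "/ead/eadheader/titlestmt/titleproper"))
```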

The '|' character may be used to separate multiple paths, as used in XSLT patterns. Equally, the XPath expression may match multiple nodes within the record. In either scenario, the response may contain multiple nodes within the nodeSet element in the response schema.
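
The effect of '|' can be sketched in Python as follows. ElementTree does not implement the union operator itself, so this example splits the expression naively on '|' (which would break on a '|' inside a string literal), evaluates each path, and returns the distinct matches in document order. The evaluate_union name and the sample record are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def evaluate_union(record_xml, expression):
    """Return the distinct nodes matched by any '|'-separated path."""
    wrapper = ET.Element("document")  # hypothetical stand-in for the document node
    wrapper.append(ET.fromstring(record_xml))

    matched = set()
    for path in expression.split("|"):  # naive split: ignores string literals
        for node in wrapper.findall("." + path.strip()):
            matched.add(id(node))

    # Walk the tree once so the result is de-duplicated and in document order.
    return [node for node in wrapper.iter() if id(node) in matched]


record = "<dc><title>A</title><creator>B</creator><title>C</title></dc>"
nodes = evaluate_union(record, "/dc/creator | /dc/title")
print([(node.tag, node.text) for node in nodes])
```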

If the supplied path cannot be evaluated because it is invalid, the server may respond with a single fatal diagnostic. On the other hand, if the expression is valid but cannot be evaluated for a particular record, the server should respond with a surrogate diagnostic in the correct place within the result set. If the expression matches no nodes within the record, then an empty nodeSet may be returned rather than a diagnostic.
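
The three outcomes described above might be handled along the following lines. This is a sketch only: the search_retrieve name, the dictionary keys and the message texts are invented for the example and are not the SRW response structures or diagnostic codes. A record that fails to parse stands in for "a record for which the expression cannot be evaluated".

```python
import xml.etree.ElementTree as ET


def search_retrieve(records, expression):
    """Apply one expression to each record, keeping result positions intact."""
    # Case 1: the expression itself is unusable, so one fatal diagnostic is returned.
    if not expression.strip():
        return {"fatal": "no XPath expression supplied", "records": []}
    try:
        ET.Element("probe").findall("." + expression)  # dry run to check the syntax
    except SyntaxError as exc:
        return {"fatal": f"invalid XPath: {exc}", "records": []}

    results = []
    for record_xml in records:
        try:
            wrapper = ET.Element("document")
            wrapper.append(ET.fromstring(record_xml))
        except ET.ParseError as exc:
            # Case 2: this record cannot be processed, so a surrogate
            # diagnostic takes its place in the result set.
            results.append({"surrogate": f"cannot evaluate: {exc}"})
            continue
        # Case 3: an empty list here plays the role of an empty nodeSet.
        nodes = wrapper.findall("." + expression)
        results.append({"nodeSet": [ET.tostring(n, encoding="unicode") for n in nodes]})
    return {"fatal": None, "records": results}


response = search_retrieve(
    ["<dc><title>A</title></dc>", "<dc><title>broken", "<dc><creator>B</creator></dc>"],
    "/dc/title",
)
for position, result in enumerate(response["records"], start=1):
    print(position, result)
```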

Servers may, at their discretion, refuse to process XPaths with any particular feature. For example, servers may refuse to process any XPath function, but accept regular paths of elements. Clients should be prepared for such a refusal via a diagnostic. Servers may also provide additional XPath functions which are not part of the standard library, for example a function which returns the element which matched the search query.
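
As one possible way for a server to apply such a policy, the Python sketch below refuses any expression that calls an XPath function while accepting plain element paths. The functions_used and check_supported names and the message text are hypothetical, and the regular expressions are a rough approximation of XPath syntax rather than a full parser. Node tests such as text() are not treated as functions here.

```python
import re

# Node-type tests look like calls but are not XPath functions.
NODE_TESTS = {"node", "text", "comment", "processing-instruction"}

_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
_CALL = re.compile(r"([A-Za-z_][\w.\-]*(?::[A-Za-z_][\w.\-]*)?)\s*\(")


def functions_used(expression):
    """List the function names that appear to be called in an expression."""
    # Blank out string literals so that text such as 'count(x)' is ignored.
    stripped = _LITERAL.sub("''", expression)
    return [name for name in _CALL.findall(stripped) if name not in NODE_TESTS]


def check_supported(expression):
    """Return None if acceptable, otherwise a message for a diagnostic."""
    used = functions_used(expression)
    if used:
        return "unsupported XPath feature: function(s) " + ", ".join(used)
    return None


print(check_supported("/dc/title"))
print(check_supported("/dc/title[contains(., 'count(x)')]"))
```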

Please see the XPath result schema documentation for examples of usage.