SRW Background
Version 1.1, 20th November 2003
Background
The SRW Initiative builds on Z39.50 and web technologies. It recognizes the importance of Z39.50 (as currently defined and deployed) for business communication, and focuses on getting information to the user. SRW provides semantics for searching databases containing metadata and objects, both text and non-text. Building on Z39.50 semantics enables the creation of gateways to existing Z39.50 systems while lowering the barriers for new information providers to make their resources available via a standard search and retrieve service.
SRW defines a web service combining several Z39.50 features, most notably the Search, Present, Sort and Scan Services. Additional features or services may be added later, or defined later as new web services.
Z39.50 Concepts Retained in SRW
- Result Sets
- Abstract Access points
- Abstract Record schemas
- Explain
- Diagnostics
SRW Features which Differ from Z39.50
- Result Set Named by Server
In contrast to Z39.50, where the client names the result set, in SRW
the server assigns the result set id. After a server executes a query
it may include in the response a result set name. (If the server does
not intend that the result set will remain reasonably static, then it
will not supply a result set name.) The server may include a result
set idle time value indicating a projected (not guaranteed) length of
time that the result set will remain available if it is not referenced.
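The server-side naming rule above can be sketched as follows. This is
a minimal illustration, not the SRW schema: the class SearchResponse,
the function build_response, the "rs" name prefix and the 300-second
default are all hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Optional

# Hypothetical counter used by the server to mint result set names.
_next_id = itertools.count(1)


@dataclass
class SearchResponse:
    number_of_records: int
    # Both fields are optional: the server omits them when it does not
    # intend the result set to remain reasonably static.
    result_set_id: Optional[str] = None
    # Projected (not guaranteed) lifetime in seconds if unreferenced.
    result_set_idle_time: Optional[int] = None


def build_response(hit_count: int, keep_result_set: bool,
                   idle_seconds: int = 300) -> SearchResponse:
    """The server, not the client, decides whether a name is supplied."""
    if not keep_result_set:
        return SearchResponse(number_of_records=hit_count)
    return SearchResponse(hit_count, f"rs{next(_next_id)}", idle_seconds)
```

A client under this sketch treats a missing result_set_id as "no
reusable result set" and simply reissues the query when it needs more
records.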
- Connections, Sessions, State
There is no explicit concept of connection, session, or state. Each
invocation of the Search/Retrieve service will be a request/response
sequence, via an XML/SOAP/RPC message using HTTP POST. Different invocations
will not be related to one another (in any way that is visible to the
protocol). A result set created by one invocation may be referenced
by a subsequent invocation; however, it is the responsibility of the
application (not the protocol) to ensure that the subsequent invocation
is authenticated/authorized to access the result set. (An authentication
token is introduced in the protocol which may be used for this purpose.)
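A single stateless invocation might be assembled as below. This is a
sketch under assumptions: the SRW namespace URI, the element names
(searchRetrieveRequest, query, startRecord, maximumRecords) and the
endpoint URL are illustrative and should be checked against the SRW
schema. The request is only constructed here, not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SRW_NS = "http://www.loc.gov/zing/srw/"  # assumed namespace URI


def build_search_request(endpoint, query, start_record=1,
                         maximum_records=10):
    """Build one self-contained request; no session state is carried."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SRW_NS}}}searchRetrieveRequest")
    # Element names below are assumptions made for illustration.
    for name, value in (("query", query),
                        ("startRecord", start_record),
                        ("maximumRecords", maximum_records)):
        ET.SubElement(request, f"{{{SRW_NS}}}{name}").text = str(value)
    payload = ET.tostring(envelope, encoding="utf-8")  # bytes
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": '""'},
        method="POST",
    )
```

Because every request carries everything the server needs, a later
request that refers to an earlier result set would have to repeat any
authentication information rather than rely on a connection.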
- No distinction between server and database
SRW does not distinguish between a server and a database; the philosophy
is that the database distinction is just another search criterion, and
search criteria are the province of the query. It is hoped that eliminating
the database concept will bring significant simplification, since
the multiple-database concept in Z39.50 has caused considerable complexity.
For example, Explain will be significantly simplified, and it is hoped
that it will therefore become more widely implemented.
- Single record syntax
All SRW records are retrieved according to a single record syntax (XML)
and therefore the Z39.50 concept of record syntax is not meaningful
in SRW. The Z39.50 concepts of element set/specification and schema
are represented by XML schemas. The following record schemas will be
distinguished in SRW: Dublin Core, Onix, MODS, and MarcXml.
- String Query
SRW specifies string queries. The query language, CQL ("Common
Query Language"), is a human-readable string representation of a query,
based loosely on CCL (the query only, without commands), with access
points defined. The CQL syntax will include the result set name, and
will support both the capability to qualify a result set (e.g. "records
in result set 'A' where title is 'B' ") and to specify only a result
set name (e.g. "records in result set 'A'") analogous to a
Z39.50 Present.
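The two uses of a result set name in a query string can be sketched
as below. The concrete spelling is an assumption: the index names
cql.resultSetId and dc.title, the "and" operator and the quoting rule
are illustrative, and the helper functions are hypothetical.

```python
def quote(term: str) -> str:
    """Wrap a term in double quotes, escaping backslashes and quotes."""
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def clause(index: str, term: str, relation: str = "=") -> str:
    """One search clause: access point, relation, quoted term."""
    return f"{index} {relation} {quote(term)}"


def result_set(name: str) -> str:
    """Only a result set name, analogous to a Z39.50 Present."""
    return clause("cql.resultSetId", name)  # assumed index name


def qualify(name: str, index: str, term: str) -> str:
    """Records in result set `name` that also match index/term."""
    return f"{result_set(name)} and {clause(index, term)}"


print(result_set("A"))
print(qualify("A", "dc.title", "B"))
```

Keeping the result set reference inside the query string means no
separate Present operation is needed; retrieval is just another
search.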
- Flat Access Points
Flat access points are defined, rather than utilizing attribute vectors
as in traditional Z39.50. For example, consider 'title - word' and 'title
- phrase'. In SRW these would be represented as distinct access points
(rather than two attribute combinations with the same Use attribute
and different qualifying attributes).
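A gateway to an existing Z39.50 system would need to translate each
flat access point back into an attribute combination. The sketch
below assumes the Bib-1 attribute set (Use 4 for title, Structure 1
for phrase and 2 for word) and a prefix-notation output; the access
point names title.word and title.phrase are hypothetical.

```python
# Flat access point -> Bib-1 attribute vector {attribute type: value}.
# Type 1 is Use, type 4 is Structure.
ACCESS_POINTS = {
    "title.word": {1: 4, 4: 2},    # Use=title, Structure=word
    "title.phrase": {1: 4, 4: 1},  # Use=title, Structure=phrase
}


def to_attribute_vector(access_point: str) -> dict:
    """Look up the attribute combination behind a flat access point."""
    try:
        return ACCESS_POINTS[access_point]
    except KeyError:
        raise ValueError(
            f"unsupported access point: {access_point}") from None


def to_prefix_query(access_point: str, term: str) -> str:
    """Render the attributes and term in prefix notation."""
    attrs = to_attribute_vector(access_point)
    parts = [f"@attr {kind}={value}"
             for kind, value in sorted(attrs.items())]
    return " ".join(parts) + f' "{term}"'


print(to_prefix_query("title.word", "fish"))
print(to_prefix_query("title.phrase", "old man and the sea"))
```

The client sees only the two names; the shared Use attribute and the
differing Structure attribute stay inside the gateway's table.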
- Static Explain
Explain information will be static (not based on the Z39.50 Explain
concept of searching an Explain database for specific information).
Explain information will include supported access points and record
schemas. The Explain simplification is also due in large part to SRW
discarding multiple databases and record syntaxes, and
it is hoped that the substantial simplification will give more motivation
to implement the SRW version of Explain than there was to implement the
Z39.50-1995 Explain.
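Static Explain can be pictured as a fixed description that is
rendered once and returned unchanged, rather than a database that is
searched. In the sketch below the XML element names and the access
point names are hypothetical; only the record schema names come from
the list given earlier in this document.

```python
import xml.etree.ElementTree as ET

# Fixed server description; nothing here is computed per request.
EXPLAIN = {
    "access_points": ("title.word", "title.phrase"),  # hypothetical
    "record_schemas": ("Dublin Core", "Onix", "MODS", "MarcXml"),
}


def render_explain(info: dict = EXPLAIN) -> str:
    """Serialize the static description as a small XML document."""
    root = ET.Element("explain")  # element names are illustrative
    points = ET.SubElement(root, "accessPoints")
    for name in info["access_points"]:
        ET.SubElement(points, "accessPoint").text = name
    schemas = ET.SubElement(root, "recordSchemas")
    for name in info["record_schemas"]:
        ET.SubElement(schemas, "recordSchema").text = name
    return ET.tostring(root, encoding="unicode")


# Rendered once at start-up; every Explain request gets this string.
EXPLAIN_XML = render_explain()
```

With a single server-level description and a single record syntax,
there is no per-database or per-syntax detail to look up, which is
what makes a static document sufficient.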
- XML instead of ASN.1
XML will be used for abstract syntax as well as encoding. ASN.1/BER
is not used.