Version 1.0, 4th June 2004
The default ordering of a result set is left up to the server, which may apply no explicit ordering at all. SRW addresses this for the most part through the sortKeys parameter. For sophisticated relevance-based ranking, however, boolean operands might be treated differently, and specific methods might be requested to combine the results of evaluating each operand. This context set attempts to address this issue by defining relation and boolean modifiers for the various known algorithms. Documentation for each algorithm is linked in the tables below.
If you wish to have an algorithm added to this set, please contact the maintainer. If you wish to use another algorithm without having it added, then you should create a new context set, but please reference this base set to avoid duplication.
If the 'relevant' relation modifier from the cql context set is given, but no named algorithm, then the server should continue to use the basic semantics: the server may decide which algorithm to use. It is also legal to include cql.relevant along with an algorithm from this set, in which case that algorithm should be used. Hence there is no need to include an 'any algorithm' relation modifier in this set.
Also, please note that, as with all context sets, these modifiers are case insensitive: "rel.CORI" and "rel.cori" are to be treated the same. This matters in practice because most of the modifiers are acronyms and so may be entered in upper case in queries, even though they are listed in lower case below.
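The two rules above (a named algorithm takes precedence over cql.relevant, and modifier names are case insensitive) can be sketched in Python as follows. This is only an illustration, not part of the context set: the function name `choose_algorithm`, the `server_default` value, and the assumption that the set is bound to the recommended short name `rel` are all hypothetical choices made for the example.

```python
# Algorithm names taken from the relation modifier table of this context set.
KNOWN_ALGORITHMS = {
    "lr", "cori", "okapi", "gloss", "ggloss",
    "dtf-cori", "redde", "cdr", "pagerank", "hilltop",
}


def choose_algorithm(modifiers, server_default="okapi", rel_prefix="rel"):
    """Return the ranking algorithm implied by a list of relation modifiers.

    `server_default` and `rel_prefix` are assumptions for this sketch: a real
    server picks its own default and resolves prefixes from the query.
    Returns None when no relevance ranking was requested.
    """
    # Modifier names are case insensitive, so compare in lower case.
    names = [m.strip().lower() for m in modifiers]
    for name in names:
        prefix, _, base = name.partition(".")
        if prefix == rel_prefix and base in KNOWN_ALGORITHMS:
            return base  # a named algorithm takes precedence
    if "cql.relevant" in names:
        return server_default  # the server may decide which algorithm to use
    return None
```

The sketch only recognises the prefixed form `cql.relevant`; a fuller implementation would also resolve unprefixed modifiers against the default context set.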
To return relevancy information attached to a record, please see the record metadata extension. (To be written up, à la the 'rec' context set.)
The identifier for the context set is: info:srw/cql-context-set/2/relevance-1.0
The recommended short name is: rel
The maintainer of the context set is: Rob Sanderson, azaroth@liv.ac.uk
There are no indexes defined in this context set.
Relation modifiers:

| Modifier Name | Description |
|---|---|
| lr | Logistic Regression algorithm from UC Berkeley |
| cori | CORI algorithm of Callan et al. (Carnegie Mellon) |
| okapi | OKAPI BM-25 of Robertson et al. (City University, London) |
| gloss | Glossary of Servers of Gravano et al. (Stanford) |
| ggloss | Generalised Glossary of Servers |
| dtf-cori | Decision-Theoretic Framework extension to CORI of Fuhr, Nottelmann (University of Duisburg-Essen) |
| redde | Relevant Document Distribution Estimation of Callan et al. (Carnegie Mellon) |
| cdr | Cover Density Ranking |
| pagerank | Google's PageRank algorithm of Brin, Page (ex Stanford) |
| hilltop | The Hilltop algorithm of Bharat, Mihaila (Google, University of Toronto) |
| const_* | A named constant relevant to the algorithm, e.g. const_k=0.7. This allows constants to be overridden for specific queries or indexes, either to ensure consistency across servers or to fine-tune the results. |
Boolean modifiers:

| Modifier Name | Description |
|---|---|
| sum | Add the values |
| mean | Average the values |
| nsum | Normalise the summed values |
| cmbz | Normalise and rescale values |
| max | Select maximum value |
| min | Select minimum value |
| nprv | Normalise values and privilege high ranked documents |
| pivot | Normalise sub-record retrieval scores based on document scores |
| const_* | A named constant relevant to the algorithm, as above |
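As an illustration of the simpler combination modifiers in the table above (sum, mean, max and min), the following Python sketch merges the scores produced by two boolean operands. The function name `combine` and the `{record_id: score}` representation are hypothetical, and the handling of records matched by only one operand (combining only the scores that are present) is an assumption of the sketch, not something this context set defines. The normalising modifiers (nsum, cmbz, nprv, pivot) are left out because their exact definitions live in the linked documentation.

```python
from statistics import mean

# Map modifier names to the function applied to a record's operand scores.
COMBINERS = {
    "sum": sum,
    "mean": mean,
    "max": max,
    "min": min,
}


def combine(left, right, method="sum"):
    """Merge two {record_id: score} maps using the named combination method.

    Assumption: a record matched by only one operand is scored from that
    single value rather than being padded with zero.
    """
    combiner = COMBINERS[method.lower()]  # modifier names are case insensitive
    merged = {}
    for record_id in left.keys() | right.keys():
        scores = [s[record_id] for s in (left, right) if record_id in s]
        merged[record_id] = combiner(scores)
    return merged
```

With this sketch, a query using `or/rel.mean` would correspond to calling `combine(left, right, "mean")` on the two operand result sets.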
Some examples of how the context set might be used.
dc.title any/rel.lr "fish squid burger cheese"

cql.anywhere all/rel.cori "sanderson denenberg" or/rel.mean dc.description any/rel.cori "information retrieval"

dc.title any/rel.lr/rel.const_c0=-0.705 "logistic regression relevance ranking techniques"
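To show how a server might pick the modifiers out of a relation or boolean token such as those in the examples, here is a minimal Python sketch. It is not a CQL parser: the function name `parse_relation` is hypothetical, and the sketch assumes the token has already been isolated from the query, contains no whitespace around the slashes, and uses only `=` to attach a value to a modifier.

```python
def parse_relation(token):
    """Split a relation or boolean token into its base name and modifiers.

    Returns (base, [(modifier_name, value_or_None), ...]).
    Names are lower-cased because modifiers are case insensitive;
    values are kept exactly as written.
    """
    base, *raw_modifiers = token.split("/")
    modifiers = []
    for raw in raw_modifiers:
        name, sep, value = raw.partition("=")
        modifiers.append((name.strip().lower(), value.strip() if sep else None))
    return base.strip().lower(), modifiers


base, mods = parse_relation("any/rel.lr/rel.const_c0=-0.705")
# base is "any"; mods is [("rel.lr", None), ("rel.const_c0", "-0.705")]
```

The constant's value is returned as a string so that the caller can decide how to interpret it for the algorithm in question.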