Public Member Functions | |
Enquire (const Enquire &other) | |
Copying is allowed (and is cheap). | |
void | operator= (const Enquire &other) |
Assignment is allowed (and is cheap). | |
Enquire (const Database &database, ErrorHandler *errorhandler_=0) | |
Create a Xapian::Enquire object. | |
~Enquire () | |
Close the Xapian::Enquire object. | |
void | set_query (const Xapian::Query &query, Xapian::termcount qlen=0) |
Set the query to run. | |
const Xapian::Query & | get_query () const |
Get the query which has been set. | |
void | add_matchspy (MatchSpy *spy) |
Add a matchspy. | |
void | clear_matchspies () |
Remove all the matchspies. | |
void | set_weighting_scheme (const Weight &weight_) |
Set the weighting scheme to use for queries. | |
void | set_collapse_key (Xapian::valueno collapse_key, Xapian::doccount collapse_max=1) |
Set the collapse key to use for queries. | |
void | set_docid_order (docid_order order) |
Set the direction in which documents are ordered by document id in the returned MSet. | |
void | set_cutoff (Xapian::percent percent_cutoff, Xapian::weight weight_cutoff=0) |
Set the percentage and/or weight cutoffs. | |
void | set_sort_by_relevance () |
Set the sorting to be by relevance only. | |
void | set_sort_by_value (Xapian::valueno sort_key, bool reverse) |
Set the sorting to be by value only. | |
void | set_sort_by_key (Xapian::KeyMaker *sorter, bool reverse) |
Set the sorting to be by key generated from values only. | |
void | set_sort_by_value_then_relevance (Xapian::valueno sort_key, bool reverse) |
Set the sorting to be by value, then by relevance for documents with the same value. | |
void | set_sort_by_key_then_relevance (Xapian::KeyMaker *sorter, bool reverse) |
Set the sorting to be by keys generated from values, then by relevance for documents with identical keys. | |
void | set_sort_by_relevance_then_value (Xapian::valueno sort_key, bool reverse) |
Set the sorting to be by relevance then value. | |
void | set_sort_by_relevance_then_key (Xapian::KeyMaker *sorter, bool reverse) |
Set the sorting to be by relevance, then by keys generated from values. | |
MSet | get_mset (Xapian::doccount first, Xapian::doccount maxitems, Xapian::doccount checkatleast=0, const RSet *omrset=0, const MatchDecider *mdecider=0) const |
Get (a portion of) the match set for the current query. | |
ESet | get_eset (Xapian::termcount maxitems, const RSet &omrset, int flags=0, double k=1.0, const Xapian::ExpandDecider *edecider=0) const |
Get the expand set for the given rset. | |
ESet | get_eset (Xapian::termcount maxitems, const RSet &omrset, const Xapian::ExpandDecider *edecider) const |
Get the expand set for the given rset. | |
TermIterator | get_matching_terms_begin (Xapian::docid did) const |
Get terms which match a given document, by document id. | |
TermIterator | get_matching_terms_end (Xapian::docid) const |
End iterator corresponding to get_matching_terms_begin(). | |
TermIterator | get_matching_terms_begin (const MSetIterator &it) const |
Get terms which match a given document, by match set item. | |
TermIterator | get_matching_terms_end (const MSetIterator &) const |
End iterator corresponding to get_matching_terms_begin(). | |
std::string | get_description () const |
Return a string describing this object. |
Databases are usually opened lazily, so exceptions may not be thrown where you would expect them to be. You should catch Xapian::Error exceptions when calling any method in Xapian::Enquire.
Xapian::InvalidArgumentError | will be thrown if an invalid argument is supplied, for example, an unknown database type. |
Xapian::Enquire::Enquire | ( | const Database & | database, | |
ErrorHandler * | errorhandler_ = 0 | |||
) | [explicit] |
Create a Xapian::Enquire object.
This specification cannot be changed once the Xapian::Enquire is opened: you must create a new Xapian::Enquire object to access a different database, or set of databases.
The database supplied must have been initialised (i.e., it must not be the result of calling the Database::Database() constructor). If you need to gracefully handle the case where you have no index, you can pass a database created with InMemory::open(), which represents a completely empty database.
database | Specification of the database or databases to use. | |
errorhandler_ | A pointer to the error handler to use. Ownership of the object pointed to is not assumed by the Xapian::Enquire object - the user should delete the Xapian::ErrorHandler object after the Xapian::Enquire object is deleted. To use no error handler, this parameter should be 0. |
Xapian::InvalidArgumentError | will be thrown if an uninitialised Database object is supplied. |
void Xapian::Enquire::add_matchspy | ( | MatchSpy * | spy | ) |
Add a matchspy.
This matchspy will be called with some of the documents which match the query, during the match process. Which of the matching documents are passed to it depends on exactly when certain optimisations occur during the match process, but this can be controlled to some extent by setting the checkatleast parameter of get_mset().
In particular, if there are enough matching documents, at least the number specified by checkatleast will be passed to the matchspy. This means that you can force the matchspy to be shown all matching documents by setting checkatleast to the number of documents in the database.
spy | The MatchSpy subclass to add. The caller must ensure that this remains valid while the Enquire object remains active, or until clear_matchspies() is called. |
ESet Xapian::Enquire::get_eset | ( | Xapian::termcount | maxitems, | |
const RSet & | omrset, | |||
const Xapian::ExpandDecider * | edecider | |||
) | const [inline] |
Get the expand set for the given rset.
maxitems | the maximum number of items to return. | |
omrset | the relevance set to use when performing the expand operation. | |
edecider | a decision functor to use to decide whether a given term should be put in the ESet |
Xapian::InvalidArgumentError | See class documentation. |
ESet Xapian::Enquire::get_eset | ( | Xapian::termcount | maxitems, | |
const RSet & | omrset, | |||
int | flags = 0 , |
|||
double | k = 1.0 , |
|||
const Xapian::ExpandDecider * | edecider = 0 | |||
) | const |
Get the expand set for the given rset.
maxitems | the maximum number of items to return. | |
omrset | the relevance set to use when performing the expand operation. | |
flags | zero or more of these values |-ed together:
| |
k | the parameter k in the query expansion algorithm (default is 1.0) | |
edecider | a decision functor to use to decide whether a given term should be put in the ESet |
Xapian::InvalidArgumentError | See class documentation. |
TermIterator Xapian::Enquire::get_matching_terms_begin | ( | const MSetIterator & | it | ) | const |
Get terms which match a given document, by match set item.
This method returns the terms in the current query which match the given document.
If the underlying database has suitable support, using this call (rather than passing a Xapian::docid) will enable the system to ensure that the correct data is returned, and that the document has not been deleted or changed since the query was performed.
it | The iterator for which to retrieve the matching terms. |
Xapian::InvalidArgumentError | See class documentation. | |
Xapian::DocNotFoundError | The document specified could not be found in the database. |
TermIterator Xapian::Enquire::get_matching_terms_begin | ( | Xapian::docid | did | ) | const |
Get terms which match a given document, by document id.
This method returns the terms in the current query which match the given document.
It is possible for the document to have been removed from the database between the time it is returned in an MSet, and the time that this call is made. If possible, you should specify an MSetIterator instead of a Xapian::docid, since this will enable database backends with suitable support to prevent this occurring.
Note that a query does not need to have been run in order to make this call.
did | The document id for which to retrieve the matching terms. |
Xapian::InvalidArgumentError | See class documentation. | |
Xapian::DocNotFoundError | The document specified could not be found in the database. |
MSet Xapian::Enquire::get_mset | ( | Xapian::doccount | first, | |
Xapian::doccount | maxitems, | |||
Xapian::doccount | checkatleast = 0 , |
|||
const RSet * | omrset = 0 , |
|||
const MatchDecider * | mdecider = 0 | |||
) | const |
Get (a portion of) the match set for the current query.
first | the first item in the result set to return. A value of zero corresponds to the first item returned being that with the highest score. A value of 10 corresponds to the first 10 items being ignored, and the returned items starting at the eleventh. | |
maxitems | the maximum number of items to return. If you want all matches, then you can pass the result of calling get_doccount() on the Database object (though if you are doing this so you can filter results, you are likely to get much better performance by using Xapian's match-time filtering features instead). You can pass 0 for maxitems which will give you an empty MSet with valid statistics (such as get_matches_estimated()) calculated without looking at any postings, which is very quick, but means the estimates may be more approximate and the bounds may be much looser. | |
checkatleast | the minimum number of items to check. Because the matcher optimises, it won't consider every document which might match, so the total number of matches is estimated. Setting checkatleast forces it to consider at least this many matches and so allows for reliable paging links. | |
omrset | the relevance set to use when performing the query. | |
mdecider | a decision functor to use to decide whether a given document should be put in the MSet. | |
matchspy | a decision functor to use to decide whether a given document should be put in the MSet. The matchspy is applied to every document which is a potential candidate for the MSet, so if there are checkatleast or more such documents, the matchspy will see at least checkatleast. The mdecider is assumed to be a relatively expensive test so may be applied in a lazier fashion. |
Xapian::InvalidArgumentError | See class documentation. |
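The relationship between first, maxitems and checkatleast can be sketched for a paged result list as follows. This is a minimal illustration using only the standard library, not part of the Xapian API: PageRequest and page_request() are hypothetical names, std::size_t stands in for Xapian::doccount, and checking one match past the end of the page is an assumption about what is needed to decide whether a "next page" link should be shown.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type holding the three leading arguments of get_mset().
struct PageRequest {
    std::size_t first;         // number of top-ranked items to skip
    std::size_t maxitems;      // maximum number of items to return
    std::size_t checkatleast;  // minimum number of matches to consider
};

// Build the arguments for a zero-based page number.
// Assumption of this sketch: considering one match beyond the end of the
// page is enough to tell whether a further page exists.
PageRequest page_request(std::size_t page, std::size_t page_size) {
    assert(page_size > 0);
    PageRequest req;
    req.first = page * page_size;
    req.maxitems = page_size;
    req.checkatleast = req.first + page_size + 1;
    return req;
}
```

For example, page_request(1, 10) skips the first 10 items and asks for the eleventh to twentieth, matching the description of first above.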
const Xapian::Query& Xapian::Enquire::get_query | ( | ) | const |
Get the query which has been set.
This is only valid after set_query() has been called.
Xapian::InvalidArgumentError | will be thrown if query has not yet been set. |
void Xapian::Enquire::set_collapse_key | ( | Xapian::valueno | collapse_key, | |
Xapian::doccount | collapse_max = 1 | |||
) |
Set the collapse key to use for queries.
collapse_key | value number to collapse on - at most collapse_max MSet entries with each particular value will be returned (default is Xapian::BAD_VALUENO which means no collapsing). | |
collapse_max | Max number of items with the same key to leave after collapsing (default 1). |
An example use might be to create a value for each document containing an MD5 hash of the document contents. Then duplicate documents from different sources can be eliminated at search time by collapsing with collapse_max = 1 (it's better to eliminate duplicates at index time, but this may not always be possible - for example the search may be over more than one Xapian database).
Another use is to group matches in a particular category (e.g. you might collapse a mailing list search on the Subject: so that there's only one result per discussion thread). In this case you can use get_collapse_count() to give the user some idea how many other results there are. And if you index the Subject: as a boolean term as well as putting it in a value, you can offer a link to a non-collapsed search restricted to that thread using a boolean filter.
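The effect of collapse_key and collapse_max on a ranked list can be modelled with a short standard-library sketch. It is only a model of the behaviour described above, not Xapian's implementation: the Match struct and collapse() function are hypothetical, and the key string stands in for the contents of the value slot chosen as the collapse key.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one ranked match: a document id plus the
// contents of the value slot used as the collapse key.
struct Match {
    unsigned docid;
    std::string key;
};

// Walk the matches in rank order and keep at most collapse_max entries
// for each distinct key; later entries with the same key are dropped.
std::vector<Match> collapse(const std::vector<Match>& ranked,
                            std::size_t collapse_max) {
    std::map<std::string, std::size_t> seen;  // entries seen so far per key
    std::vector<Match> kept;
    for (const Match& m : ranked) {
        if (seen[m.key]++ < collapse_max) {
            kept.push_back(m);
        }
    }
    return kept;
}
```

With collapse_max = 1 this keeps only the best-ranked entry per key, which is the duplicate-elimination case; a larger collapse_max keeps a few entries per group, as in the mailing list example.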
void Xapian::Enquire::set_cutoff | ( | Xapian::percent | percent_cutoff, | |
Xapian::weight | weight_cutoff = 0 | |||
) |
Set the percentage and/or weight cutoffs.
percent_cutoff | Minimum percentage score for returned documents. If a document has a lower percentage score than this, it will not appear in the MSet. If your intention is to return only matches which contain all the terms in the query, then it's more efficient to use Xapian::Query::OP_AND instead of Xapian::Query::OP_OR in the query than to use set_cutoff(100). (default 0 => no percentage cut-off). | |
weight_cutoff | Minimum weight for a document to be returned. If a document has a lower score than this, it will not appear in the MSet. It is usually only possible to choose an appropriate weight for cutoff based on the results of a previous run of the same query; this is thus mainly useful for alerting operations. The other potential use is with a user specified weighting scheme. (default 0 => no weight cut-off). |
void Xapian::Enquire::set_docid_order | ( | docid_order | order | ) |
Set the direction in which documents are ordered by document id in the returned MSet.
This order only has an effect on documents which would otherwise have equal rank. For a weighted probabilistic match with no sort value, this means documents with equal weight. For a boolean match, with no sort value, this means all documents. And if a sort value is used, this means documents with equal sort value (and also equal weight if ordering on relevance after the sort).
order | This can be:
|
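The tie-breaking role of the document id order can be illustrated with a small standard-library model of a weighted match with no sort value. This is a sketch under stated assumptions rather than Xapian code: Ranked and rank_matches() are hypothetical names, and a plain bool stands in for the order argument.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a match: document id and relevance weight.
struct Ranked {
    unsigned docid;
    double weight;
};

// Order by descending weight. Only when two weights are equal does the
// document id decide, in ascending or descending order as requested.
void rank_matches(std::vector<Ranked>& items, bool ascending_docid) {
    std::sort(items.begin(), items.end(),
              [ascending_docid](const Ranked& a, const Ranked& b) {
                  if (a.weight != b.weight) return a.weight > b.weight;
                  return ascending_docid ? a.docid < b.docid
                                         : a.docid > b.docid;
              });
}
```

Documents with different weights keep the same relative position whichever direction is chosen; only the equal-weight documents swap places.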
void Xapian::Enquire::set_query | ( | const Xapian::Query & | query, | |
Xapian::termcount | qlen = 0 | |||
) |
Set the query to run.
query | the new query to run. | |
qlen | the query length to use in weight calculations - by default the sum of the wqf of all terms is used. |
void Xapian::Enquire::set_sort_by_key | ( | Xapian::KeyMaker * | sorter, | |
bool | reverse | |||
) |
Set the sorting to be by key generated from values only.
sorter | The functor to use for generating keys. | |
reverse | If true, reverses the sort order. |
void Xapian::Enquire::set_sort_by_key_then_relevance | ( | Xapian::KeyMaker * | sorter, | |
bool | reverse | |||
) |
Set the sorting to be by keys generated from values, then by relevance for documents with identical keys.
sorter | The functor to use for generating keys. | |
reverse | If true, reverses the sort order. |
void Xapian::Enquire::set_sort_by_relevance | ( | ) |
Set the sorting to be by relevance only.
This is the default.
void Xapian::Enquire::set_sort_by_relevance_then_key | ( | Xapian::KeyMaker * | sorter, | |
bool | reverse | |||
) |
Set the sorting to be by relevance, then by keys generated from values.
Note that with the default BM25 weighting scheme parameters, non-identical documents will rarely have the same weight, so this setting will give very similar results to set_sort_by_relevance(). It becomes more useful with particular BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom weighting schemes.
sorter | The functor to use for generating keys. | |
reverse | If true, reverses the sort order. |
void Xapian::Enquire::set_sort_by_relevance_then_value | ( | Xapian::valueno | sort_key, | |
bool | reverse | |||
) |
Set the sorting to be by relevance then value.
Note that sorting by values uses a string comparison, so to use this to sort by a numeric value you'll need to store the numeric values in a manner which sorts appropriately. For example, you could use Xapian::sortable_serialise() (which works for floating point numbers as well as integers), or store numbers padded with leading zeros or spaces, or with the number of digits prepended.
Note that with the default BM25 weighting scheme parameters, non-identical documents will rarely have the same weight, so this setting will give very similar results to set_sort_by_relevance(). It becomes more useful with particular BM25 parameter settings (e.g. BM25Weight(1,0,1,0,0)) or custom weighting schemes.
sort_key | value number to sort on. | |
reverse | If true, reverses the sort order. |
void Xapian::Enquire::set_sort_by_value | ( | Xapian::valueno | sort_key, | |
bool | reverse | |||
) |
Set the sorting to be by value only.
Note that sorting by values uses a string comparison, so to use this to sort by a numeric value you'll need to store the numeric values in a manner which sorts appropriately. For example, you could use Xapian::sortable_serialise() (which works for floating point numbers as well as integers), or store numbers padded with leading zeros or spaces, or with the number of digits prepended.
sort_key | value number to sort on. | |
reverse | If true, reverses the sort order. |
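The string-comparison caveat can be seen with plain std::string values: "10" sorts before "9" because comparison proceeds character by character. The sketch below shows the zero-padding approach mentioned above using only the standard library; pad_number() is a hypothetical helper, and the fixed width is an assumption that must be at least as large as the longest number you store.

```cpp
#include <cstddef>
#include <string>

// Left-pad a non-negative integer with zeros to a fixed width, so that
// string comparison of the results agrees with numeric comparison for
// all numbers that fit within that width.
std::string pad_number(unsigned long n, std::size_t width) {
    std::string s = std::to_string(n);
    if (s.size() < width) {
        s.insert(0, width - s.size(), '0');
    }
    return s;
}
```

Here pad_number(9, 4) gives "0009" and pad_number(10, 4) gives "0010", which compare in numeric order. Numbers with more digits than the chosen width are returned unpadded and would sort incorrectly, which is one reason Xapian::sortable_serialise() may be the more robust choice.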
void Xapian::Enquire::set_sort_by_value_then_relevance | ( | Xapian::valueno | sort_key, | |
bool | reverse | |||
) |
Set the sorting to be by value, then by relevance for documents with the same value.
Note that sorting by values uses a string comparison, so to use this to sort by a numeric value you'll need to store the numeric values in a manner which sorts appropriately. For example, you could use Xapian::sortable_serialise() (which works for floating point numbers as well as integers), or store numbers padded with leading zeros or spaces, or with the number of digits prepended.
sort_key | value number to sort on. | |
reverse | If true, reverses the sort order. |
void Xapian::Enquire::set_weighting_scheme | ( | const Weight & | weight_ | ) |
Set the weighting scheme to use for queries.
weight_ | the new weighting scheme. If no weighting scheme is specified, the default is BM25 with the default parameters. |