capreolus.searcher

Package Contents

Classes

Searcher()
AnseriniSearcherMixIn() MixIn for searchers that use Anserini’s SearchCollection script
PostprocessMixin()
BM25() BM25 with fixed k1 and b.
BM25Grid() BM25 with a grid search for k1 and b. The search covers b from 0.1 to bmax and k1 from 0.1 to k1max, in increments of 0.1.
BM25RM3() BM25 with RM3 query expansion
BM25PostProcess() BM25 with fixed k1 and b, followed by post-processing of the run
StaticBM25RM3Rob04Yang19() Tuned BM25+RM3 run used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.
BM25PRF() BM25 with PRF
AxiomaticSemanticMatching() TODO: Add more info on retrieval method
DirichletQL() Dirichlet QL with a fixed mu
QLJM() QL with Jelinek-Mercer smoothing
INL2() I(n)L2 scoring model
SPL() SPL scoring model
F2Exp() F2Exp scoring model
F2Log() F2Log scoring model
SDM() Sequential Dependency Model

Functions

list2str(l)
capreolus.searcher.logger
capreolus.searcher.MAX_THREADS
capreolus.searcher.list2str(l)
class capreolus.searcher.Searcher

Bases: profane.ModuleBase

module_type = searcher
static load_trec_run(fn)
static write_trec_run(preds, outfn)
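
Judging by their names, these two static helpers read and write TREC-format run files. The sketch below is not the library's implementation. It assumes the conventional six-column TREC run line (`qid Q0 docid rank score tag`) and an in-memory layout of `{qid: {docid: score}}`; the function names `read_run` and `write_run` and the tag value are hypothetical.

```python
from collections import defaultdict


def write_run(preds, outfn, tag="sketch"):
    """Write {qid: {docid: score}} as TREC run lines, best score first."""
    with open(outfn, "w") as outf:
        for qid in sorted(preds):
            ranked = sorted(preds[qid].items(), key=lambda kv: kv[1], reverse=True)
            for rank, (docid, score) in enumerate(ranked, start=1):
                outf.write(f"{qid} Q0 {docid} {rank} {score} {tag}\n")


def read_run(fn):
    """Read TREC run lines back into {qid: {docid: score}}."""
    run = defaultdict(dict)
    with open(fn) as f:
        for line in f:
            fields = line.split()
            if len(fields) != 6:
                continue  # skip blank or malformed lines
            qid, _q0, docid, _rank, score, _tag = fields
            run[qid][docid] = float(score)
    return dict(run)
```

Ranks are recomputed from the scores on write, so a round trip through `write_run` and `read_run` preserves the scores but not any externally assigned ranks.
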
class capreolus.searcher.AnseriniSearcherMixIn

MixIn for searchers that use Anserini’s SearchCollection script

class capreolus.searcher.PostprocessMixin
filter(self, run_dir, docs_to_remove=None, docs_to_keep=None, topn=None)
dedup(self, run_dir, topn=None)
class capreolus.searcher.BM25

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with fixed k1 and b.
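
For orientation, k1 controls term-frequency saturation and b controls document-length normalization. The following is a minimal sketch of a per-term BM25 score in its classic form with a Lucene-style IDF. It is not the code path used by this class, which delegates to Anserini; the function name, argument names, and default values are illustrative assumptions.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=0.9, b=0.4):
    """Score one query term against one document (illustrative defaults)."""
    # Inverse document frequency, Lucene-style (always non-negative)
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: b=0 ignores length, b=1 normalizes fully
    norm = 1 - b + b * doc_len / avg_doc_len
    # k1 bounds how much repeated occurrences of the term can add
    return idf * tf * (k1 + 1) / (tf + k1 * norm)
```

A document's score for a query is the sum of this value over the query terms.
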

module_name = BM25
dependencies
config_spec
query_from_file(self, topicsfn, output_path)

Runs BM25 search. Takes the queries from a topics file and runs them against the index.

Parameters:
topicsfn: Path to a topics file
output_path: Path where the results of the search (i.e., the run file) should be stored

Returns: Path to the run file where the results of the search are stored

query(self, query)
class capreolus.searcher.BM25Grid

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with a grid search for k1 and b. The search covers b from 0.1 to bmax and k1 from 0.1 to k1max, in increments of 0.1.
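
A minimal sketch of how such a grid could be enumerated, assuming bmax and k1max are the upper bounds taken from the config. The helper name `grid_values` and the example bounds are hypothetical; stepping over integers avoids the floating-point drift that repeated addition of 0.1 would introduce.

```python
from itertools import product


def grid_values(upper):
    """Return 0.1, 0.2, ... up to and including `upper`."""
    tenths = int(round(upper * 10))
    return [i / 10 for i in range(1, tenths + 1)]


# Hypothetical bounds; every (b, k1) pair is one search configuration
bmax, k1max = 0.3, 0.5
grid = list(product(grid_values(bmax), grid_values(k1max)))
```

The number of configurations grows as the product of the two ranges, so large bounds multiply the number of searches accordingly.
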

module_name = BM25Grid
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
query(self, query, b, k1)
class capreolus.searcher.BM25RM3

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with RM3 query expansion

module_name = BM25RM3
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
query(self, query, b, k1, fbterms, fbdocs, ow)
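
The `fbterms`, `fbdocs`, and `ow` arguments correspond to the usual RM3 knobs: the number of expansion terms, the number of feedback documents, and the weight given to the original query. The sketch below illustrates the general RM3 idea only and is not Anserini's implementation. It assumes the caller passes the top `fbdocs` documents as `(score, tokens)` pairs with non-empty token lists and positive scores, and it weights documents by normalized retrieval score, which is a simplification. The function name `rm3_expand` is hypothetical.

```python
from collections import Counter


def rm3_expand(query_terms, feedback_docs, fbterms=10, ow=0.5):
    """Return {term: weight} for the expanded query; weights sum to 1."""
    # Original query model: relative frequency of each query term
    q_counts = Counter(query_terms)
    q_total = sum(q_counts.values())
    q_model = {t: c / q_total for t, c in q_counts.items()}

    # Relevance model: P(w|R) is proportional to sum over docs of P(w|d) * weight(d)
    total_score = sum(score for score, _ in feedback_docs)
    rm1 = Counter()
    for score, tokens in feedback_docs:
        for term, count in Counter(tokens).items():
            rm1[term] += (count / len(tokens)) * (score / total_score)

    # Keep the fbterms strongest feedback terms and renormalize
    top = dict(rm1.most_common(fbterms))
    norm = sum(top.values())
    fb_model = {t: w / norm for t, w in top.items()}

    # Interpolate: ow on the original query, (1 - ow) on the feedback model
    terms = set(q_model) | set(fb_model)
    return {t: ow * q_model.get(t, 0.0) + (1 - ow) * fb_model.get(t, 0.0) for t in terms}
```

With `ow=1.0` the feedback terms receive zero weight, so the expanded query scores like the original one.
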
class capreolus.searcher.BM25PostProcess

Bases: capreolus.searcher.BM25, capreolus.searcher.PostprocessMixin

BM25 with fixed k1 and b, followed by post-processing of the run.

module_name = BM25Postprocess
config_spec
query_from_file(self, topicsfn, output_path, docs_to_remove=None)

Runs BM25 search. Takes the queries from a topics file and runs them against the index.

Parameters:
topicsfn: Path to a topics file
output_path: Path where the results of the search (i.e., the run file) should be stored

Returns: Path to the run file where the results of the search are stored

class capreolus.searcher.StaticBM25RM3Rob04Yang19

Bases: capreolus.searcher.Searcher

Tuned BM25+RM3 run used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.

[1] Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the “Neural Hype”: Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. SIGIR 2019.

module_name = bm25staticrob04yang19
query_from_file(self, topicsfn, output_path)
query(self, *args, **kwargs)
class capreolus.searcher.BM25PRF

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with PRF

module_name = BM25PRF
dependencies
config_spec
static list2str(l)
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.AxiomaticSemanticMatching

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

TODO: Add more info on retrieval method. Also, BM25 is hard-coded to be the scoring model.

module_name = axiomatic
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.DirichletQL

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

Dirichlet QL with a fixed mu
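
As a reminder of what mu does, the following minimal sketch computes the Dirichlet-smoothed query-likelihood probability of one term. It is a textbook formulation rather than the code run by this class; the function name, argument names, and the default mu are illustrative assumptions, and `coll_prob` (the term's probability in the whole collection) is assumed to be greater than zero.

```python
import math


def dirichlet_log_prob(tf, doc_len, coll_prob, mu=1000.0):
    """log P(term | document) with Dirichlet prior smoothing."""
    # mu acts like a pseudo-document of length mu drawn from the collection model
    return math.log((tf + mu * coll_prob) / (doc_len + mu))
```

Larger values of mu pull the estimate toward the collection model, which matters most for short documents.
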

module_name = DirichletQL
dependencies
config_spec
query_from_file(self, topicsfn, output_path)

Runs Dirichlet QL search. Takes the queries from a topics file and runs them against the index.

Parameters:
topicsfn: Path to a topics file
output_path: Path where the results of the search (i.e., the run file) should be stored

Returns: Path to the run file where the results of the search are stored

query(self, query)
class capreolus.searcher.QLJM

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

QL with Jelinek-Mercer smoothing
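
Jelinek-Mercer smoothing linearly interpolates the document model with the collection model. The sketch below is a textbook formulation, not the code run by this class. It assumes `lam` is the weight on the collection model (conventions differ on which component lambda weights) and that `coll_prob` is greater than zero; all names and the default value are illustrative.

```python
import math


def jm_log_prob(tf, doc_len, coll_prob, lam=0.1):
    """log P(term | document) with Jelinek-Mercer smoothing."""
    # Maximum-likelihood estimate from the document itself
    doc_prob = tf / doc_len if doc_len > 0 else 0.0
    # Fixed linear mix of document and collection estimates
    return math.log((1 - lam) * doc_prob + lam * coll_prob)
```

Unlike Dirichlet smoothing, the amount of smoothing here does not depend on document length.
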

module_name = QLJM
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.INL2

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

I(n)L2 scoring model

module_name = INL2
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.SPL

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

SPL scoring model

module_name = SPL
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.F2Exp

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

F2Exp scoring model

module_name = F2Exp
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.F2Log

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

F2Log scoring model

module_name = F2Log
dependencies
config_spec
query_from_file(self, topicsfn, output_path)
class capreolus.searcher.SDM

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

Sequential Dependency Model. The scoring model is hardcoded to be BM25 (TODO: Make it configurable?)

module_name = SDM
dependencies
config_spec
query_from_file(self, topicsfn, output_path)