capreolus.searcher

Package Contents

Classes

Searcher(config=None, provide=None, share_dependency_objects=False, build=True) Base class for profane modules.
AnseriniSearcherMixIn() MixIn for searchers that use Anserini’s SearchCollection script
PostprocessMixin()
BM25(config=None, provide=None, share_dependency_objects=False, build=True) BM25 with fixed k1 and b.
BM25Grid(config=None, provide=None, share_dependency_objects=False, build=True) BM25 with a grid search for k1 and b. The search runs from 0.1 to k1max (for k1) and from 0.1 to bmax (for b) in 0.1 increments.
BM25RM3(config=None, provide=None, share_dependency_objects=False, build=True) BM25 with RM3 query expansion.
BM25PostProcess(config=None, provide=None, share_dependency_objects=False, build=True) BM25 with fixed k1 and b.
StaticBM25RM3Rob04Yang19(config=None, provide=None, share_dependency_objects=False, build=True) Tuned BM25+RM3 run used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.
BM25PRF(config=None, provide=None, share_dependency_objects=False, build=True) BM25 with PRF
AxiomaticSemanticMatching(config=None, provide=None, share_dependency_objects=False, build=True) TODO: Add more info on retrieval method
DirichletQL(config=None, provide=None, share_dependency_objects=False, build=True) Dirichlet QL with a fixed mu
QLJM(config=None, provide=None, share_dependency_objects=False, build=True) QL with Jelinek-Mercer smoothing
INL2(config=None, provide=None, share_dependency_objects=False, build=True) I(n)L2 scoring model
SPL(config=None, provide=None, share_dependency_objects=False, build=True) SPL scoring model
F2Exp(config=None, provide=None, share_dependency_objects=False, build=True) F2Exp scoring model
F2Log(config=None, provide=None, share_dependency_objects=False, build=True) F2Log scoring model
SDM(config=None, provide=None, share_dependency_objects=False, build=True) Sequential Dependency Model

Functions

list2str(l, delimiter='-')
capreolus.searcher.logger[source]
capreolus.searcher.MAX_THREADS[source]
capreolus.searcher.list2str(l, delimiter='-')[source]
class capreolus.searcher.Searcher(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: profane.ModuleBase

Base class for profane modules. Module construction proceeds as follows:
  1) Any config options not present in config are filled in with their default values. Config options and their defaults are specified in the config_spec class attribute.
  2) Any dependencies declared in the dependencies class attribute are recursively instantiated. If the dependency object is present in provide, this object will be used instead of instantiating a new object for the dependency.
  3) The module object’s config variable is updated to reflect the configs of its dependencies and then frozen.

After construction is complete, the module’s dependencies are available as instance variables: self.`dependency key`.

Parameters:
  • config – dictionary containing a config to apply to this module and its dependencies
  • provide – dictionary mapping dependency keys to module objects
  • share_dependency_objects – if true, dependencies will be cached in the registry based on their configs and reused. See the share_objects argument of ModuleBase.create.
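
The three construction steps can be illustrated with a small stand-alone sketch. It does not import profane or capreolus: SketchModule, SketchIndex, SketchSearcher and the option values are hypothetical names chosen for illustration, and the real ModuleBase may differ in its details (for example in how nested configs are keyed or how the config is frozen).

```python
from types import MappingProxyType


class SketchModule:
    """Hypothetical stand-in for a module base class (not profane's code)."""

    config_spec = {}   # option name -> default value
    dependencies = {}  # dependency key -> module class

    def __init__(self, config=None, provide=None):
        config = dict(config or {})
        provide = provide or {}

        # Step 1: options missing from config fall back to their defaults.
        own = {name: config.get(name, default)
               for name, default in self.config_spec.items()}

        # Step 2: instantiate each dependency recursively, unless an
        # object for that dependency key was passed in via provide.
        for key, cls in self.dependencies.items():
            if key in provide:
                dep = provide[key]
            else:
                dep = cls(config=config.get(key), provide=provide)
            setattr(self, key, dep)  # available as self.<dependency key>
            own[key] = dict(dep.config)  # reflect the dependency's config

        # Step 3: freeze the final config (read-only view).
        self.config = MappingProxyType(own)


class SketchIndex(SketchModule):
    config_spec = {"stemmer": "porter"}  # illustrative option


class SketchSearcher(SketchModule):
    config_spec = {"k1": 0.9, "b": 0.4}  # illustrative values
    dependencies = {"index": SketchIndex}


searcher = SketchSearcher(config={"b": 0.75})
print(dict(searcher.config))
```

In this sketch, k1 keeps its default, b takes the supplied value, and the index dependency is built automatically and exposed as searcher.index. Passing provide={"index": some_index} would reuse an existing object instead.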
module_type = searcher[source]
static load_trec_run(fn)[source]
static write_trec_run(preds, outfn)[source]
query_from_file(self, topicsfn, output_path)[source]
query(self, query, **kwargs)[source]

Search documents for the given query, using the parameters in config as defaults.
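
The load_trec_run and write_trec_run helpers listed above deal with TREC run files. As a rough illustration of that file format only, the sketch below writes and reads the standard six-column layout (query id, the literal Q0, document id, rank, score, run tag). The function names write_run_sketch and load_run_sketch, the tag value, and the {qid: {docid: score}} structure are assumptions made for this example; the actual Capreolus helpers may represent runs differently.

```python
import os
import tempfile
from collections import defaultdict


def write_run_sketch(preds, outfn, tag="sketch"):
    """Write {qid: {docid: score}} as a six-column TREC run file."""
    with open(outfn, "w", encoding="utf-8") as outf:
        for qid in sorted(preds):
            # Rank documents by descending score; ranks start at 1.
            ranked = sorted(preds[qid].items(), key=lambda kv: kv[1], reverse=True)
            for rank, (docid, score) in enumerate(ranked, start=1):
                outf.write(f"{qid} Q0 {docid} {rank} {score} {tag}\n")


def load_run_sketch(fn):
    """Read a TREC run file back into {qid: {docid: score}}."""
    run = defaultdict(dict)
    with open(fn, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if len(fields) != 6:
                continue  # skip blank or malformed lines
            qid, _q0, docid, _rank, score, _tag = fields
            run[qid][docid] = float(score)
    return dict(run)


# Made-up query and document ids, for illustration only.
preds = {"301": {"doc-a": 12.5, "doc-b": 9.25}, "302": {"doc-c": 7.0}}
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "run.txt")
    write_run_sketch(preds, path)
    roundtrip = load_run_sketch(path)
```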

class capreolus.searcher.AnseriniSearcherMixIn[source]

MixIn for searchers that use Anserini’s SearchCollection script

class capreolus.searcher.PostprocessMixin[source]
filter(self, run_dir, docs_to_remove=None, docs_to_keep=None, topn=None)[source]
dedup(self, run_dir, topn=None)[source]
class capreolus.searcher.BM25(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with fixed k1 and b.
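
To show what the two fixed parameters control, here is a minimal sketch of a textbook BM25 term score. It is not the Anserini/Lucene implementation, which may differ in details such as the IDF variant and length encoding. The function name bm25_term_score and the default values are assumptions for illustration, not necessarily this module's defaults.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=0.9, b=0.4):
    """Score contribution of one query term in one document (textbook BM25).

    k1 controls how quickly repeated occurrences of a term saturate;
    b controls how strongly the score is normalized by document length.
    """
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1) / (tf + k1 * length_norm)


# Same term frequency in a short and a long document.
short_doc = bm25_term_score(tf=3, doc_len=50, avg_doc_len=100, n_docs=1000, doc_freq=10)
long_doc = bm25_term_score(tf=3, doc_len=400, avg_doc_len=100, n_docs=1000, doc_freq=10)
print(short_doc > long_doc)
```

With b greater than 0, the longer document receives the lower score for the same term frequency. Setting b to 0 removes length normalization entirely.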

module_name = BM25[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.BM25Grid(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with a grid search for k1 and b. The search runs from 0.1 to k1max (for k1) and from 0.1 to bmax (for b) in 0.1 increments.
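
The grid described above can be enumerated as in the following sketch. The helper names grid_values and bm25_grid are hypothetical, and treating the upper bounds as inclusive is an assumption; the module itself passes its parameters on to Anserini rather than calling code like this.

```python
from itertools import product


def grid_values(vmax):
    """Values 0.1, 0.2, ..., vmax in 0.1 steps (upper bound assumed inclusive)."""
    n_steps = int(round(vmax * 10))
    # Dividing integers by 10 avoids accumulating floating-point error.
    return [i / 10 for i in range(1, n_steps + 1)]


def bm25_grid(k1max, bmax):
    """All (k1, b) pairs in the grid."""
    return list(product(grid_values(k1max), grid_values(bmax)))


grid = bm25_grid(k1max=1.0, bmax=0.5)
print(len(grid), grid[0], grid[-1])
```

Each (k1, b) pair corresponds to one BM25 configuration to evaluate, so the number of runs grows with the product of the two ranges.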

module_name = BM25Grid[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.BM25RM3(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with RM3 query expansion. Module construction and the constructor parameters are the same as for capreolus.searcher.Searcher.

module_name = BM25RM3[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.BM25PostProcess(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.BM25, capreolus.searcher.PostprocessMixin

BM25 with fixed k1 and b.

module_name = BM25Postprocess[source]
config_spec[source]
query_from_file(self, topicsfn, output_path, docs_to_remove=None)[source]
class capreolus.searcher.StaticBM25RM3Rob04Yang19(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher

Tuned BM25+RM3 run used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.

[1] Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the “Neural Hype”: Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. SIGIR 2019.

module_name = bm25staticrob04yang19[source]
query(self, *args, **kwargs)[source]

Search documents for the given query, using the parameters in config as defaults.

class capreolus.searcher.BM25PRF(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

BM25 with PRF

module_name = BM25PRF[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.AxiomaticSemanticMatching(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

TODO: Add more info on retrieval method. BM25 is hard-coded as the scoring model.

module_name = axiomatic[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.DirichletQL(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

Dirichlet QL with a fixed mu
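
For reference, a minimal sketch of the Dirichlet-smoothed term probability used in query likelihood scoring. This is the textbook formula, not Anserini's code; the function name dirichlet_log_prob and the default mu are illustrative assumptions.

```python
import math


def dirichlet_log_prob(tf, doc_len, coll_prob, mu=1000.0):
    """log p(term | document) with Dirichlet prior smoothing.

    tf: term count in the document; doc_len: document length in tokens;
    coll_prob: probability of the term in the whole collection;
    mu: smoothing strength (larger mu trusts the collection model more).
    """
    return math.log((tf + mu * coll_prob) / (doc_len + mu))


# A term absent from the document still gets a finite score.
unseen = dirichlet_log_prob(tf=0, doc_len=100, coll_prob=0.01)
seen = dirichlet_log_prob(tf=5, doc_len=100, coll_prob=0.01)
print(seen > unseen)
```

A document's query likelihood score is the sum of these log probabilities over the query terms.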

module_name = DirichletQL[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.QLJM(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

QL with Jelinek-Mercer smoothing
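
Jelinek-Mercer smoothing interpolates the document and collection language models with a fixed weight. The sketch below uses the textbook form in which the weight is applied to the collection model. The function name jm_log_prob, the parameter name lam and its default are assumptions for illustration, and the parameter convention used by Anserini may differ.

```python
import math


def jm_log_prob(tf, doc_len, coll_prob, lam=0.1):
    """log p(term | document) with Jelinek-Mercer smoothing.

    lam is the weight on the collection model; (1 - lam) weights the
    document's maximum-likelihood estimate tf / doc_len.
    """
    return math.log((1 - lam) * tf / doc_len + lam * coll_prob)


seen = jm_log_prob(tf=5, doc_len=100, coll_prob=0.01)
unseen = jm_log_prob(tf=0, doc_len=100, coll_prob=0.01)
print(seen > unseen)
```

Unlike Dirichlet smoothing, the amount of smoothing here does not depend on document length.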

module_name = QLJM[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.INL2(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

I(n)L2 scoring model

module_name = INL2[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.SPL(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

SPL scoring model

module_name = SPL[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.F2Exp(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

F2Exp scoring model

module_name = F2Exp[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.F2Log(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

F2Log scoring model

module_name = F2Log[source]
dependencies[source]
config_spec[source]
class capreolus.searcher.SDM(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher, capreolus.searcher.AnseriniSearcherMixIn

Sequential Dependency Model. The scoring model is hard-coded to be BM25 (TODO: make it configurable?).

module_name = SDM[source]
dependencies[source]
config_spec[source]