capreolus.searcher.anserini

Module Contents

Classes

AnseriniSearcherMixIn

MixIn for searchers that use Anserini's SearchCollection script

PostprocessMixin

BM25

Anserini BM25. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

BM25Grid

Deprecated. BM25 with a grid search for k1 and b. The search runs from 0.1 to k1max (for k1) and from 0.1 to bmax (for b) in increments of 0.1.

BM25RM3

Anserini BM25 with RM3 expansion. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

BM25PostProcess

Anserini BM25. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticBM25RM3Rob04Yang19

Tuned BM25+RM3 run used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.

StaticBM25RM3Rob04Yang19Desc

Tuned BM25+RM3 robust04 description run on the folds used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.

StaticBM25Rob04Huston14Title

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticBM25Rob04Huston14Desc

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticBM25Gov2

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticBM25Gov2Desc

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticBM25Genomics

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticBM25CDS

CDS BM25 run with k1=4.0 and b=0.6, with the new CDS 2016 documents removed from the 2014 and 2015 queries.

StaticCovidUdelAbstract

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticRM3TitleCore18

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

StaticRM3DescCore18

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

BM25PRF

Anserini BM25 PRF. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

AxiomaticSemanticMatching

Anserini BM25 with Axiomatic query expansion. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

DirichletQL

Anserini QL with Dirichlet smoothing. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

QLJM

Anserini QL with Jelinek-Mercer smoothing. This searcher's parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

INL2

Anserini I(n)L2 scoring model. This searcher does not support list parameters.

SPL

Anserini SPL scoring model. This searcher does not support list parameters.

F2Exp

F2Exp scoring model. This searcher does not support list parameters.

F2Log

F2Log scoring model. This searcher does not support list parameters.

SDM

Anserini BM25 with the Sequential Dependency Model. This searcher supports list parameters only for k1 and b.

Functions

list2str(l[, delimiter])

Attributes

logger

MAX_THREADS

capreolus.searcher.anserini.logger[source]
capreolus.searcher.anserini.MAX_THREADS[source]
capreolus.searcher.anserini.list2str(l, delimiter='-')[source]
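The page gives no description for list2str beyond its signature. Judging from the name and the default delimiter alone, it most likely joins the elements of a list into a single delimited string. The following is a minimal sketch under that assumption, not the library's actual implementation:

```python
def list2str(l, delimiter="-"):
    """Join the items of `l` into one string (assumed behavior, inferred from the signature)."""
    # Convert each item to str first so that numeric parameter values can be joined.
    return delimiter.join(str(item) for item in l)


joined_default = list2str([0.4, 0.6, 0.8])
joined_comma = list2str(["k1", "b"], delimiter=",")
```

A helper of this shape is handy for turning a list of parameter values into a compact string, for example when building a file name or a command-line argument.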
class capreolus.searcher.anserini.AnseriniSearcherMixIn[source]

MixIn for searchers that use Anserini’s SearchCollection script

dependencies[source]
class capreolus.searcher.anserini.PostprocessMixin[source]
filter(run_dir, docs_to_remove=None, docs_to_keep=None, topn=None)[source]
dedup(run_dir, topn=None)[source]
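The filter signature suggests that a run can be restricted by removing documents, keeping only certain documents, and truncating to the top n results. The sketch below illustrates that idea on an in-memory run instead of a run directory. The function name filter_run, the {qid: {docid: score}} layout, and the per-query shape of docs_to_remove and docs_to_keep are all assumptions made for illustration and are not taken from the library:

```python
def filter_run(run, docs_to_remove=None, docs_to_keep=None, topn=None):
    """Filter a run of the form {qid: {docid: score}} (hypothetical layout)."""
    filtered = {}
    for qid, doc_scores in run.items():
        # Documents to drop for this query; empty if none were given.
        remove = set((docs_to_remove or {}).get(qid, ()))
        # Documents to keep for this query; None means "no restriction".
        keep = docs_to_keep.get(qid) if docs_to_keep is not None else None
        kept = {
            docid: score
            for docid, score in doc_scores.items()
            if docid not in remove and (keep is None or docid in keep)
        }
        # Rank by descending score before truncating to the top n documents.
        ranked = sorted(kept.items(), key=lambda pair: pair[1], reverse=True)
        if topn is not None:
            ranked = ranked[:topn]
        filtered[qid] = dict(ranked)
    return filtered


run = {"q1": {"d1": 3.0, "d2": 2.0, "d3": 1.0}}
top1_without_d2 = filter_run(run, docs_to_remove={"q1": ["d2"]}, topn=1)
```

In this sketch a query that is absent from docs_to_keep is left unrestricted; whether the library treats that case the same way is not stated on this page.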
class capreolus.searcher.anserini.BM25(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini BM25. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'BM25'[source]
config_spec[source]
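The BM25 description above mentions two list notations for grid search: an explicit comma-separated list ("0.4,0.6,0.8,1.0") and a range with a step ("0.4..1,0.2"). The sketch below shows one way such strings could be expanded into concrete values. The helper name expand_grid is hypothetical, and the reading "start..stop,step" with an inclusive stop is an assumption based on the two examples appearing to be equivalent:

```python
from decimal import Decimal


def expand_grid(spec):
    """Expand "a,b,c" or "start..stop,step" into a list of floats (assumed semantics)."""
    spec = str(spec)
    if ".." in spec:
        bounds, step = spec.split(",")
        start, stop = bounds.split("..")
        # Decimal avoids floating-point drift when accumulating steps such as 0.2.
        start, stop, step = Decimal(start), Decimal(stop), Decimal(step)
        if step <= 0:
            raise ValueError("step must be positive")
        values = []
        current = start
        while current <= stop:  # the stop value is treated as inclusive
            values.append(float(current))
            current += step
        return values
    # Plain comma-separated list; a single value yields a one-element list.
    return [float(value) for value in spec.split(",")]


from_list = expand_grid("0.4,0.6,0.8,1.0")
from_range = expand_grid("0.4..1,0.2")
```

Under these assumptions both notations expand to the same four values, which matches how the docstring presents them side by side.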
class capreolus.searcher.anserini.BM25Grid(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Deprecated. BM25 with a grid search for k1 and b. The search runs from 0.1 to k1max (for k1) and from 0.1 to bmax (for b) in increments of 0.1.

module_name = 'BM25Grid'[source]
config_spec[source]
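To make the BM25Grid search space concrete, the sketch below enumerates every (k1, b) pair on a 0.1 grid up to the given maxima. The helper names grid_values and bm25_grid are hypothetical, and treating k1max and bmax as inclusive upper bounds is an assumption:

```python
from itertools import product

STEP = 0.1  # grid increment stated in the BM25Grid description


def grid_values(vmax):
    """Values 0.1, 0.2, ... up to vmax (assumed inclusive)."""
    n = int(round(vmax / STEP))
    # Rounding to one decimal removes floating-point noise such as 0.30000000000000004.
    return [round(i * STEP, 1) for i in range(1, n + 1)]


def bm25_grid(k1max, bmax):
    """All (k1, b) combinations on the grid."""
    return list(product(grid_values(k1max), grid_values(bmax)))


pairs = bm25_grid(k1max=0.2, bmax=0.3)
```

The number of combinations grows as the product of the two ranges, so k1max=1.0 and bmax=1.0 would already give 100 parameter settings.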
class capreolus.searcher.anserini.BM25RM3(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini BM25 with RM3 expansion. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'BM25RM3'[source]
config_spec[source]
class capreolus.searcher.anserini.BM25PostProcess(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: BM25, PostprocessMixin

Anserini BM25. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'BM25Postprocess'[source]
config_spec[source]
query_from_file(topicsfn, output_path, docs_to_remove=None)[source]
class capreolus.searcher.anserini.StaticRun(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: capreolus.searcher.Searcher

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

abstract query(*args, **kwargs)[source]

Search for documents matching the given query, using the parameters in config as defaults.
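The StaticRun subclasses below each name a precomputed run file through run_fn. As an illustration of how document scores could be read from such a file, the sketch below parses lines in the standard six-column TREC run format (qid, Q0, docid, rank, score, tag). That the bundled run files use this format is an assumption, and load_trec_run is a hypothetical helper, not part of the library:

```python
from collections import defaultdict


def load_trec_run(lines):
    """Parse TREC-format run lines into {qid: {docid: score}}."""
    run = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Columns: query id, literal "Q0", document id, rank, score, run tag.
        qid, _q0, docid, _rank, score, _tag = line.split()
        run[qid][docid] = float(score)
    return dict(run)


example_lines = [
    "301 Q0 FBIS3-1 1 12.5 bm25",
    "301 Q0 FBIS3-2 2 11.0 bm25",
    "302 Q0 LA0101-3 1 9.75 bm25",
]
scores = load_trec_run(example_lines)
```

The same function accepts an open file object, since iterating over a file yields its lines.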

class capreolus.searcher.anserini.StaticBM25RM3Rob04Yang19(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Tuned BM25+RM3 run used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.

[1] Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the “Neural Hype”: Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. SIGIR 2019.

module_name = 'bm25staticrob04yang19'[source]
run_fn = 'rob04_yang19_rm3.run'[source]
class capreolus.searcher.anserini.StaticBM25RM3Rob04Yang19Desc(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Tuned BM25+RM3 robust04 description run on the folds used by Yang et al. in [1]. This should be used only with a benchmark using the same folds and queries.

[1] Wei Yang, Kuang Lu, Peilin Yang, and Jimmy Lin. Critically Examining the “Neural Hype”: Weak Baselines and the Additivity of Effectiveness Gains from Neural Ranking Models. SIGIR 2019.

module_name = 'bm25staticrob04yang19desc'[source]
run_fn = 'rob04_yang19_desc_rm3.run'[source]
class capreolus.searcher.anserini.StaticBM25Rob04Huston14Title(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'bm25staticrob04huston14title'[source]
run_fn = 'rob04_huston14_title_rm3.run'[source]
class capreolus.searcher.anserini.StaticBM25Rob04Huston14Desc(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'bm25staticrob04huston14desc'[source]
run_fn = 'rob04_huston14_desc_rm3.run'[source]
class capreolus.searcher.anserini.StaticBM25Gov2(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'bm25staticgov2'[source]
run_fn = 'gov2_bm25.run'[source]
class capreolus.searcher.anserini.StaticBM25Gov2Desc(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'bm25staticgov2desc'[source]
run_fn = 'gov2_desc_bm25.run'[source]
class capreolus.searcher.anserini.StaticBM25Genomics(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'bm25staticgenomics'[source]
run_fn = 'genomics_bm25.run'[source]
class capreolus.searcher.anserini.StaticBM25CDS(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

CDS BM25 run with k1=4.0 and b=0.6, with the new CDS 2016 documents removed from the 2014 and 2015 queries.

module_name = 'bm25staticcds'[source]
run_fn = 'cds_bm25.run'[source]
class capreolus.searcher.anserini.StaticCovidUdelAbstract(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'qdelstaticcovidabstract'[source]
run_fn = 'anserini.covid-r5.abstract.qdel.bm25-top1k.txt'[source]
class capreolus.searcher.anserini.StaticRM3TitleCore18(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'rm3staticcore18title'[source]
run_fn = 'core18_title_rm3.run'[source]
class capreolus.searcher.anserini.StaticRM3DescCore18(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: StaticRun

Base class for Searcher modules. The purpose of a Searcher is to query a collection via an Index module.

Similar to Rerankers, Searchers return a list of documents and their relevance scores for a given query. Searchers are unsupervised and efficient, whereas Rerankers are supervised and do not use an inverted index directly.

Modules should provide:
  • a query(string) and a query_from_file(path) method that return document scores

module_name = 'rm3staticcore18desc'[source]
run_fn = 'core18_desc_rm3.run'[source]
class capreolus.searcher.anserini.BM25PRF(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini BM25 PRF. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'BM25PRF'[source]
config_spec[source]
class capreolus.searcher.anserini.AxiomaticSemanticMatching(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini BM25 with Axiomatic query expansion. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'axiomatic'[source]
config_spec[source]
class capreolus.searcher.anserini.DirichletQL(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini QL with Dirichlet smoothing. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'DirichletQL'[source]
config_spec[source]
class capreolus.searcher.anserini.QLJM(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini QL with Jelinek-Mercer smoothing. This searcher’s parameters can also be specified as lists indicating parameters to grid search (e.g., "0.4,0.6,0.8,1.0" or "0.4..1,0.2").

module_name = 'QLJM'[source]
config_spec[source]
class capreolus.searcher.anserini.INL2(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini I(n)L2 scoring model. This searcher does not support list parameters.

module_name = 'INL2'[source]
config_spec[source]
class capreolus.searcher.anserini.SPL(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini SPL scoring model. This searcher does not support list parameters.

module_name = 'SPL'[source]
config_spec[source]
class capreolus.searcher.anserini.F2Exp(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

F2Exp scoring model. This searcher does not support list parameters.

module_name = 'F2Exp'[source]
config_spec[source]
class capreolus.searcher.anserini.F2Log(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

F2Log scoring model. This searcher does not support list parameters.

module_name = 'F2Log'[source]
config_spec[source]
class capreolus.searcher.anserini.SDM(config=None, provide=None, share_dependency_objects=False, build=True)[source]

Bases: AnseriniSearcherMixIn, capreolus.searcher.Searcher

Anserini BM25 with the Sequential Dependency Model. This searcher supports list parameters only for k1 and b.

module_name = 'SDM'[source]
config_spec[source]