`capreolus.benchmark`

Package Contents

Classes

`Benchmark`	Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.
`IRDBenchmark`	Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.

Functions

validate(build_f)

Attributes

logger

capreolus.benchmark.logger

capreolus.benchmark.validate(build_f)

class capreolus.benchmark.Benchmark(config=None, provide=None, share_dependency_objects=False, build=True)

Bases: capreolus.ModuleBase

Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.

Modules should provide:

a topics dict mapping query ids (qids) to queries
a qrels dict mapping qids to docids and relevance labels
a folds dict mapping a fold name to training, dev (validation), and testing qids

If these can be loaded from files in standard formats, they can be specified by setting topic_file, qrel_file, and fold_file, respectively, rather than by setting the above attributes directly.
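As a rough illustration of the three structures listed above, the sketch below builds toy topics, qrels, and folds dicts and checks that they are consistent with each other. The qids, docids, fold name "s1", the flat "train"/"dev"/"test" keys, and the helper find_inconsistent_qids are all assumptions made for this example; the exact nesting capreolus uses internally may differ.

```python
# Toy data only; the key names and nesting are assumptions, not the capreolus format.
topics = {"q1": "neural ranking models", "q2": "query expansion"}
qrels = {"q1": {"d1": 2, "d2": 0}, "q2": {"d3": 1}}
folds = {"s1": {"train": ["q1"], "dev": ["q2"], "test": ["q2"]}}


def find_inconsistent_qids(topics, qrels, folds):
    """Return a sorted list of fold qids that lack a topic or a qrels entry.

    Hypothetical helper for illustration; it is not part of capreolus.
    """
    missing = set()
    for fold in folds.values():
        for qids in fold.values():
            for qid in qids:
                if qid not in topics or qid not in qrels:
                    missing.add(qid)
    return sorted(missing)


# An empty list means every qid referenced by a fold has a query and judgments.
problems = find_inconsistent_qids(topics, qrels, folds)
```

A check of this kind is cheap to run before an experiment, since a fold that references an unknown qid usually only fails much later, during evaluation.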

property qrels

property topics

property folds

property non_nn_dev

module_type = 'benchmark'

qrel_file

topic_file

fold_file

query_type

relevance_level = 1: Documents with a relevance label >= relevance_level are considered relevant. This corresponds to trec_eval’s --level_for_rel option (and is passed to pytrec_eval as relevance_level).
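The thresholding rule described above can be sketched in plain Python as follows. The function binarize_qrels is a hypothetical helper written for this example; it is not part of capreolus, trec_eval, or pytrec_eval, and it only mirrors the "label >= relevance_level" rule stated in the description.

```python
def binarize_qrels(qrels, relevance_level=1):
    """Map graded labels to 1 (relevant) or 0 (not relevant).

    A document counts as relevant when its label is >= relevance_level.
    Hypothetical helper for illustration only.
    """
    return {
        qid: {docid: int(label >= relevance_level) for docid, label in docs.items()}
        for qid, docs in qrels.items()
    }


graded = {"q1": {"d1": 2, "d2": 1, "d3": 0}}

# With the default level, labels 1 and 2 both count as relevant.
default_view = binarize_qrels(graded)

# Raising the level to 2 keeps only the highest-graded document.
strict_view = binarize_qrels(graded, relevance_level=2)
```

Raising relevance_level is mainly useful for collections with graded judgments, where the lowest positive label marks only marginally relevant documents.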

use_train_as_dev = True: Whether to use the training set as the validation set when no training is needed, e.g., for traditional IR algorithms like BM25.

get_topics_file(query_sets=None): Returns the path to a topics file in TSV format containing the queries from query_sets. query_sets may contain any combination of ‘train’, ‘dev’, and ‘test’. Queries from all three sets are returned if query_sets is None.
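A minimal sketch of the behaviour described above, under stated assumptions: the function write_topics_tsv, its parameters, the output file name, and the flat fold layout with "train", "dev", and "test" keys are all hypothetical. It is not the capreolus implementation, only an illustration of selecting queries by set and writing them as qid, tab, query lines.

```python
import csv
import os
import tempfile


def write_topics_tsv(topics, fold, query_sets=None, out_dir=None):
    """Write the queries for the requested sets to a TSV file and return its path.

    Hypothetical helper; `fold` is assumed to map 'train', 'dev', and 'test'
    to lists of qids.
    """
    valid = ("train", "dev", "test")
    query_sets = valid if query_sets is None else tuple(query_sets)
    unknown = set(query_sets) - set(valid)
    if unknown:
        raise ValueError(f"unknown query sets: {sorted(unknown)}")

    # Deduplicate qids, since the same query may appear in more than one set.
    qids = sorted({qid for name in query_sets for qid in fold[name]})

    out_dir = out_dir or tempfile.mkdtemp()
    path = os.path.join(out_dir, "topics." + "-".join(sorted(query_sets)) + ".tsv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for qid in qids:
            writer.writerow([qid, topics[qid]])
    return path


toy_topics = {"q1": "neural ranking models", "q2": "query expansion"}
toy_fold = {"train": ["q1"], "dev": ["q2"], "test": ["q2"]}

dev_path = write_topics_tsv(toy_topics, toy_fold, query_sets=["dev"])
all_path = write_topics_tsv(toy_topics, toy_fold)
```

Passing query_sets=None selects all three sets, mirroring the documented default.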

build()

class capreolus.benchmark.IRDBenchmark(config=None, provide=None, share_dependency_objects=False, build=True)

Bases: Benchmark