capreolus.benchmark

Package Contents

Classes

Benchmark

Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.

IRDBenchmark

Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.

class capreolus.benchmark.Benchmark(config=None, provide=None, share_dependency_objects=False, build=True)

Bases: capreolus.ModuleBase

Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.

Modules should provide:
  • a topics dict mapping query ids (qids) to queries

  • a qrels dict mapping qids to docids and relevance labels

  • a folds dict mapping a fold name to training, dev (validation), and testing qids

If these can be loaded from files in standard formats, set topic_file, qrel_file, and fold_file, respectively, instead of setting the attributes above directly.
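The three structures described above can be pictured with plain dicts. The sketch below is illustrative only: the sample qids, docids, and queries are made up, the split keys 'train', 'dev', and 'test' are an assumed layout rather than the library's actual fold format, and check_benchmark_data is a hypothetical helper that is not part of capreolus.

```python
# Illustrative data only; the exact nesting used by capreolus may differ.
topics = {
    "301": "international organized crime",
    "302": "poliomyelitis and post-polio",
}

# qid -> {docid -> relevance label}
qrels = {
    "301": {"doc_a": 1, "doc_b": 0},
    "302": {"doc_c": 2},
}

# fold name -> {split name -> list of qids}; the split keys are an assumption
folds = {
    "s1": {"train": ["301"], "dev": ["302"], "test": ["302"]},
}


def check_benchmark_data(topics, qrels, folds):
    """Hypothetical helper: list every qid in folds or qrels that has no topic."""
    problems = []
    for fold_name, splits in folds.items():
        for split_name, qids in splits.items():
            for qid in qids:
                if qid not in topics:
                    problems.append(f"{fold_name}/{split_name}: qid {qid} has no topic")
    for qid in qrels:
        if qid not in topics:
            problems.append(f"qrels: qid {qid} has no topic")
    return problems


problems = check_benchmark_data(topics, qrels, folds)
```

A check like this is cheap to run before an experiment, since a qid that appears in a fold but not in topics would otherwise only surface at retrieval time.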

module_type = benchmark
qrel_file
topic_file
fold_file
query_type
relevance_level = 1

Documents with a relevance label >= relevance_level will be considered relevant. This corresponds to trec_eval’s --level_for_rel option (and is passed to pytrec_eval as relevance_level).
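The thresholding rule can be sketched in a few lines of plain Python. relevant_docids is a hypothetical helper written for illustration, not a capreolus or pytrec_eval function, and the sample labels are made up.

```python
def relevant_docids(qrels, relevance_level=1):
    """Hypothetical helper: keep docids whose label is >= relevance_level."""
    return {
        qid: {docid for docid, label in doc_labels.items() if label >= relevance_level}
        for qid, doc_labels in qrels.items()
    }


# Made-up graded judgments for a single query
sample_qrels = {"q1": {"d1": 0, "d2": 1, "d3": 2}}

at_level_1 = relevant_docids(sample_qrels, relevance_level=1)  # d2 and d3 count as relevant
at_level_2 = relevant_docids(sample_qrels, relevance_level=2)  # only d3 counts as relevant
```

Raising relevance_level is the usual way to evaluate against only the more strongly relevant documents in a collection with graded judgments.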

property qrels(self)
property topics(self)
property folds(self)
get_topics_file(self, query_sets=None)

Returns the path to a topics file in TSV format containing the queries from query_sets. query_sets may contain any combination of ‘train’, ‘dev’, and ‘test’. If query_sets is None, all three are included.
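The behavior described for get_topics_file can be approximated as follows. This is a minimal sketch under stated assumptions, not the library's implementation: write_topics_tsv is a hypothetical function, the fold is assumed to be a dict with 'train', 'dev', and 'test' keys, and the two-column qid/query layout is an assumption about the TSV format.

```python
import csv
import os
import tempfile


def write_topics_tsv(topics, fold, path, query_sets=None):
    """Hypothetical helper: write 'qid<TAB>query' rows for the chosen query sets."""
    if query_sets is None:
        query_sets = ("train", "dev", "test")  # None means all query sets
    wanted = set()
    for name in query_sets:
        wanted.update(fold[name])
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for qid in sorted(wanted):
            writer.writerow([qid, topics[qid]])
    return path


# Made-up topics and a single fold with an assumed layout
topics = {
    "301": "international organized crime",
    "302": "poliomyelitis and post-polio",
    "303": "hubble telescope achievements",
}
fold = {"train": ["301"], "dev": ["302"], "test": ["303"]}

out_dir = tempfile.mkdtemp()
dev_test_path = write_topics_tsv(topics, fold, os.path.join(out_dir, "dev_test.tsv"), ["dev", "test"])
all_path = write_topics_tsv(topics, fold, os.path.join(out_dir, "all.tsv"))
```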

class capreolus.benchmark.IRDBenchmark(config=None, provide=None, share_dependency_objects=False, build=True)

Bases: Benchmark

Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.

ird_dataset_names = []
property qrels(self)
property topics(self)
ird_load_qrels(self)
ird_load_topics(self)
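The page does not describe what ird_load_qrels and ird_load_topics do. Assuming they collect records from the datasets named in ird_dataset_names into the qrels and topics dicts described for Benchmark, the conversion might look like the sketch below. The Query and Qrel namedtuples are stand-ins with the query_id, text, doc_id, and relevance fields that ir_datasets records expose; build_topics and build_qrels are hypothetical helpers, and the records are made up.

```python
from collections import namedtuple

# Stand-ins for the records yielded by a dataset's query and qrel iterators
Query = namedtuple("Query", ["query_id", "text"])
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance"])


def build_topics(query_iterables):
    """Hypothetical helper: merge queries from several datasets into qid -> query."""
    topics = {}
    for queries in query_iterables:
        for query in queries:
            topics[query.query_id] = query.text
    return topics


def build_qrels(qrel_iterables):
    """Hypothetical helper: merge judgments into qid -> {docid -> label}."""
    qrels = {}
    for records in qrel_iterables:
        for rec in records:
            qrels.setdefault(rec.query_id, {})[rec.doc_id] = rec.relevance
    return qrels


# Made-up records standing in for two entries of ird_dataset_names
dataset_a_qrels = [Qrel("q1", "d1", 1), Qrel("q1", "d2", 0)]
dataset_b_qrels = [Qrel("q2", "d3", 2)]

topics = build_topics([[Query("q1", "first query")], [Query("q2", "second query")]])
qrels = build_qrels([dataset_a_qrels, dataset_b_qrels])
```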