capreolus.benchmark.covid
Module Contents
Classes
COVID | Ongoing TREC-COVID benchmark from https://ir.nist.gov/covidSubmit that uses documents from CORD, the COVID-19 Open Research Dataset (https://www.semanticscholar.org/cord19).
CovidQA | Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.
Attributes
- class capreolus.benchmark.covid.COVID(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.benchmark.Benchmark
Ongoing TREC-COVID benchmark from https://ir.nist.gov/covidSubmit that uses documents from CORD, the COVID-19 Open Research Dataset (https://www.semanticscholar.org/cord19).
- prep_backward_compatible_qrels(tmp_dir, prev_qrels_fn, tgt_qrel_fn)
- Prepare a round 3 qrels file that is compatible with previous rounds:
  - convert the new docids in qrels-covid_d3_j0.5-3.txt back to their old ids
  - remove judgments that already exist in round 1 and round 2
Warning: this function should not be used when search or training is done on a collection released in round 4 or later, where the docids are already updated.
- Parameters
tmp_dir – pathlib.Path object, the directory in which to store downloaded files
prev_qrels_fn – qrels file that stores the qrels from previous rounds (round 1 and round 2)
tgt_qrel_fn – file path where the processed round 3 qrels file is stored
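The two steps described for prep_backward_compatible_qrels can be sketched on in-memory data as follows. This is a minimal illustration and not the Capreolus implementation: the helper name remap_and_filter_qrels, the new_to_old_docid mapping, and the nested-dict layout of the qrels are all assumptions made for the example, and no files are downloaded or written.

```python
def remap_and_filter_qrels(new_qrels, new_to_old_docid, prev_qrels):
    """Hypothetical helper, not part of capreolus.

    new_qrels / prev_qrels: dict mapping qid -> {docid: label} (assumed layout)
    new_to_old_docid: dict mapping updated docids to their earlier ids
    """
    result = {}
    for qid, doc_labels in new_qrels.items():
        already_judged = prev_qrels.get(qid, {})
        for docid, label in doc_labels.items():
            # Step 1: map the updated docid back to its old id
            # (docids without a mapping entry are kept unchanged).
            old_docid = new_to_old_docid.get(docid, docid)
            # Step 2: drop judgments that already exist in earlier rounds.
            if old_docid in already_judged:
                continue
            result.setdefault(qid, {})[old_docid] = label
    return result


# Made-up example data: one query with three round 3 judgments.
round3 = {"1": {"new_a": 2, "doc_b": 0, "doc_c": 1}}
mapping = {"new_a": "old_a"}        # "new_a" was called "old_a" in earlier rounds
previous = {"1": {"doc_b": 0}}      # "doc_b" was already judged in rounds 1 and 2

filtered = remap_and_filter_qrels(round3, mapping, previous)
print(filtered)  # {'1': {'old_a': 2, 'doc_c': 1}}
```

Here "new_a" is renamed back to "old_a" and "doc_b" is dropped because it was judged before, so only judgments that are new in round 3 remain.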
- class capreolus.benchmark.covid.CovidQA(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.benchmark.Benchmark
Base class for Benchmark modules. The purpose of a Benchmark is to provide the data needed to run an experiment, such as queries, folds, and relevance judgments.
- Modules should provide:
  - a topics dict mapping query ids (qids) to queries
  - a qrels dict mapping qids to docids and relevance labels
  - a folds dict mapping a fold name to training, dev (validation), and testing qids
If these can be loaded from files in standard formats, they can be specified by setting the topic_file, qrel_file, and fold_file, respectively, rather than by setting the above attributes directly.
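As a rough illustration of the three mappings listed above, the sketch below builds them as plain Python dicts and checks that every qid used in a fold has a topic. The exact nesting and key names Capreolus expects are not stated here, so the "train", "dev", and "test" keys, the fold name "s1", the example ids, and the helper unknown_fold_qids are all assumptions made for the example.

```python
# Assumed shapes, for illustration only.
topics = {"q1": "example query one", "q2": "example query two"}
qrels = {"q1": {"doc_a": 2, "doc_b": 0}, "q2": {"doc_c": 1}}
folds = {"s1": {"train": ["q1"], "dev": ["q2"], "test": ["q2"]}}


def unknown_fold_qids(topics, folds):
    """Return the sorted qids referenced by any fold that have no topic."""
    missing = set()
    for split in folds.values():
        for qids in split.values():
            missing.update(q for q in qids if q not in topics)
    return sorted(missing)


missing = unknown_fold_qids(topics, folds)
print(missing)  # [] because every qid in the fold has a topic
```

A check like this can catch mismatches between the fold definition and the topics before an experiment is run.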