capreolus.collection.robust04

Module Contents

Classes

Robust04

TREC Robust04 (TREC disks 4 and 5 without the Congressional Record documents)

Attributes

logger

PACKAGE_PATH

capreolus.collection.robust04.logger
capreolus.collection.robust04.PACKAGE_PATH
class capreolus.collection.robust04.Robust04(config=None, provide=None, share_dependency_objects=False, build=True)

Bases: capreolus.collection.Collection

TREC Robust04 (TREC disks 4 and 5 without the Congressional Record documents)

module_name = robust04
collection_type = TrecCollection
generator_type = DefaultLuceneDocumentGenerator
config_keys_not_in_path = ['path']
config_spec
download_if_missing(self)

Download the collection and return its path. Subclasses should override this.
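
The override pattern described above can be illustrated with a minimal, self-contained sketch. It does not use capreolus itself: `BaseCollection`, `ToyCollection`, `get_path`, and `cache_dir` are hypothetical names invented for illustration, and the "download" simply writes a placeholder file locally.

```python
from pathlib import Path


class BaseCollection:
    """Hypothetical stand-in for a collection base class (not capreolus code)."""

    def __init__(self, path=None):
        self.path = path

    def download_if_missing(self):
        # Mirrors the documented contract: return the collection path.
        raise NotImplementedError("subclasses should override this")

    def get_path(self):
        # Hypothetical helper: use a configured path if it exists,
        # otherwise fall back to the subclass's download logic.
        if self.path and Path(self.path).exists():
            return str(self.path)
        return self.download_if_missing()


class ToyCollection(BaseCollection):
    """Hypothetical subclass whose 'download' writes a placeholder document."""

    def __init__(self, cache_dir, path=None):
        super().__init__(path)
        self.cache_dir = Path(cache_dir)

    def download_if_missing(self):
        target = self.cache_dir / "toy_collection"
        if not target.exists():
            # Only fetch (here: create) the data on the first call.
            target.mkdir(parents=True)
            (target / "doc1.txt").write_text("placeholder document")
        return str(target)
```

Calling `ToyCollection(some_dir).get_path()` twice returns the same directory, and the placeholder is only written on the first call.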

download_index(self, cachedir, url, sha256, index_directory_inside, index_cache_path_string, index_expected_document_count)
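
The `sha256` parameter suggests that the downloaded index archive is checked against a known digest, although the method body is not shown here, so that is an assumption. As a rough sketch of such a check using only the standard library (the helper names `sha256_of_file` and `verify_download` are hypothetical and not part of capreolus):

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
    return path
```

A caller would typically run `verify_download` right after fetching the archive and before extracting it, so that a corrupted or truncated download is rejected early.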