capreolus.extractor
Submodules
capreolus.extractor.bagofwords
capreolus.extractor.bertpassage
capreolus.extractor.berttext
capreolus.extractor.birch_bertpassage
capreolus.extractor.common
capreolus.extractor.deeptileextractor
capreolus.extractor.embedtext
capreolus.extractor.lce_bertpassage
capreolus.extractor.pooled_bertpassage
capreolus.extractor.slowembedtext
Package Contents
Classes
Extractor: Base class for Extractor modules. The purpose of an Extractor is to convert queries and documents to a representation suitable for use with a Reranker module.
Attributes
- class capreolus.extractor.Extractor(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.ModuleBase
Base class for Extractor modules. The purpose of an Extractor is to convert queries and documents to a representation suitable for use with a Reranker module.
Modules should provide:
- an id2vec(qid, posdocid, negdocid=None) method that converts the given query and document ids to an appropriate representation
- get_state_cache_file_path(qids, docids)
Returns the path to the cache file used to store the extractor state, regardless of whether it exists or not.
- is_state_cached(qids, docids)
Returns a boolean indicating whether the state corresponding to the qids and docids passed has already been cached.
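The relationship between the two cache methods above can be illustrated with a minimal, self-contained sketch. This is not capreolus's implementation: the class name CachedStateSketch, the cache_dir argument, the MD5-based file name, and the ".state" suffix are all assumptions made for illustration. It only shows the documented contract: the path is computed whether or not the file exists, and the cached check reports whether that file is present.

```python
import hashlib
import tempfile
from pathlib import Path


class CachedStateSketch:
    """Hypothetical stand-in for an extractor; not capreolus's actual class."""

    def __init__(self, cache_dir):
        # Assumption: cached state lives in a single directory.
        self.cache_dir = Path(cache_dir)

    def get_state_cache_file_path(self, qids, docids):
        # Sort the ids so the same sets map to the same file regardless of order.
        key = repr((sorted(qids), sorted(docids))).encode("utf-8")
        # The path is returned whether or not the file exists.
        return self.cache_dir / (hashlib.md5(key).hexdigest() + ".state")

    def is_state_cached(self, qids, docids):
        # Cached means the file at the computed path is present on disk.
        return self.get_state_cache_file_path(qids, docids).exists()


with tempfile.TemporaryDirectory() as tmp:
    sketch = CachedStateSketch(tmp)
    path = sketch.get_state_cache_file_path(["q2", "q1"], ["d1"])
    before = sketch.is_state_cached(["q1", "q2"], ["d1"])
    path.write_text("state")  # simulate an extractor saving its state
    after = sketch.is_state_cached(["q1", "q2"], ["d1"])
```

In this sketch the check is False before the file is written and True afterwards, and the order in which ids are passed does not change the path.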
- abstract id2vec(qid, posdocid, negdocid=None, label=None, *args, **kwargs)
Creates a feature from the (qid, docid) pair. If negdocid is supplied, it is also included in the feature (needed for training with pairwise hinge loss). label is a vector of shape [num_classes] and is supplied only when using pointwise training (i.e., cross entropy). With pointwise samples, negdocid is None, and label is either [0, 1] or [1, 0], depending on whether the document represented by posdocid is relevant or irrelevant, respectively.
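To make the pairwise and pointwise cases concrete, here is a small standalone sketch of a subclass implementing id2vec. It does not import capreolus: ExtractorSketch is a hypothetical stand-in for the real base class, ToyTokenExtractor and its in-memory queries and docs dictionaries are invented for illustration, and the keys of the returned dictionary are assumptions rather than the feature layout any real Extractor produces.

```python
from abc import ABC, abstractmethod


class ExtractorSketch(ABC):
    """Hypothetical stand-in for capreolus.extractor.Extractor."""

    @abstractmethod
    def id2vec(self, qid, posdocid, negdocid=None, label=None, *args, **kwargs):
        ...


class ToyTokenExtractor(ExtractorSketch):
    """Toy extractor that represents text as whitespace-split tokens."""

    def __init__(self, queries, docs):
        # Assumption: raw text is available in plain dicts keyed by id.
        self.queries = queries
        self.docs = docs

    def id2vec(self, qid, posdocid, negdocid=None, label=None, *args, **kwargs):
        feature = {
            "qid": qid,
            "posdocid": posdocid,
            "query": self.queries[qid].split(),
            "posdoc": self.docs[posdocid].split(),
        }
        if negdocid is not None:
            # Pairwise training: include the negative document as well.
            feature["negdocid"] = negdocid
            feature["negdoc"] = self.docs[negdocid].split()
        if label is not None:
            # Pointwise training: label has shape [num_classes].
            feature["label"] = list(label)
        return feature


queries = {"q1": "deer habitat"}
docs = {"d1": "roe deer live in woodland", "d2": "stock prices fell"}
extractor = ToyTokenExtractor(queries, docs)

# Pairwise sample: a positive and a negative document, no label.
pairwise = extractor.id2vec("q1", "d1", negdocid="d2")

# Pointwise sample: negdocid is None and [0, 1] marks posdocid as relevant.
pointwise = extractor.id2vec("q1", "d1", label=[0, 1])
```

A real extractor would typically return padded token ids or embeddings suited to its Reranker; plain token lists are used here only to keep the sketch dependency-free.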