capreolus.sampler
Package Contents
Classes
Base class for profane modules.
Samples training data triplets. Each sample is of the form (query, relevant doc, non-relevant doc).
Samples training data pairs. Each sample is of the form (query, doc).
Samples training data triplets. Each sample is of the form (query, relevant doc, non-relevant doc).
Creates a Dataset for evaluation (test) data to be used with a PyTorch DataLoader.
Attributes
- class capreolus.sampler.Sampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: capreolus.ModuleBase
Base class for profane modules. Module construction proceeds as follows:
1) Any config options not present in config are filled in with their default values. Config options and their defaults are specified in the config_spec class attribute.
2) Any dependencies declared in the dependencies class attribute are recursively instantiated. If the dependency object is present in provide, this object will be used instead of instantiating a new object for the dependency.
3) The module object’s config variable is updated to reflect the configs of its dependencies and then frozen.
After construction is complete, the module’s dependencies are available as instance variables: self.`dependency key`.
- Parameters
config – dictionary containing a config to apply to this module and its dependencies
provide – dictionary mapping dependency keys to module objects
share_dependency_objects – if true, dependencies will be cached in the registry based on their configs and reused. See the share_objects argument of ModuleBase.create.
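The three construction steps described above can be illustrated with a small self-contained sketch. This is not the capreolus or profane implementation: ToyModule, ToyExtractor, ToySampler, and the config_defaults attribute are hypothetical names invented for the illustration, and the real config_spec format and registry caching are not modeled.

```python
from types import MappingProxyType


class ToyModule:
    """Hypothetical stand-in for a module base class (not capreolus.ModuleBase)."""

    config_defaults = {}  # hypothetical: option name -> default value
    dependencies = {}     # hypothetical: dependency key -> module class

    def __init__(self, config=None, provide=None):
        config = dict(config or {})
        provide = dict(provide or {})

        # Step 1: start from the defaults, then apply the options that were given.
        merged = dict(self.config_defaults)
        merged.update({k: v for k, v in config.items() if k not in self.dependencies})

        # Step 2: build each dependency, unless an object was provided for its key.
        for key, cls in self.dependencies.items():
            if key in provide:
                dep = provide[key]
            else:
                dep = cls(config=config.get(key), provide=provide)
            setattr(self, key, dep)  # available afterwards as self.<dependency key>
            # Step 3 (first half): record the dependency's config in our own.
            merged[key] = dep.config

        # Step 3 (second half): freeze the config behind a read-only view.
        self.config = MappingProxyType(merged)


class ToyExtractor(ToyModule):
    config_defaults = {"maxlen": 128}


class ToySampler(ToyModule):
    config_defaults = {"seed": 42}
    dependencies = {"extractor": ToyExtractor}


# The dependency is instantiated from the nested config; "seed" falls back to its default.
built = ToySampler(config={"extractor": {"maxlen": 64}})

# The dependency is supplied through provide, so no new extractor is created.
shared_extractor = ToyExtractor()
reused = ToySampler(provide={"extractor": shared_extractor})
```

In this sketch, built.extractor carries maxlen=64 while built.config["seed"] keeps its default, and reused.extractor is the very object passed in through provide. Assigning to an entry of the frozen config raises a TypeError.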
- prepare(qid_to_docids, qrels, extractor, relevance_level=1, **kwargs)
- Parameters
qid_to_docids – a dict of the form {qid: [list of docids to rank]}
qrels – a dict of the form {qid: {docid: label}}
extractor – an Extractor instance (e.g., EmbedText)
relevance_level – threshold score below which documents are considered to be non-relevant
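To make the argument shapes concrete, the sketch below builds small qid_to_docids and qrels dictionaries and splits each query's candidate documents at a relevance threshold. The helper split_by_relevance is hypothetical and is not part of capreolus. Treating unjudged documents as label 0 is an assumption made for this example, not documented behavior of prepare.

```python
def split_by_relevance(qid_to_docids, qrels, relevance_level=1):
    """Hypothetical helper: partition candidate docids per query by label.

    Documents with label >= relevance_level count as relevant; everything
    below the threshold counts as non-relevant. Docids missing from qrels
    are assumed to have label 0 (an assumption of this sketch).
    """
    relevant, non_relevant = {}, {}
    for qid, docids in qid_to_docids.items():
        labels = qrels.get(qid, {})
        relevant[qid] = [d for d in docids if labels.get(d, 0) >= relevance_level]
        non_relevant[qid] = [d for d in docids if labels.get(d, 0) < relevance_level]
    return relevant, non_relevant


# {qid: [list of docids to rank]}
qid_to_docids = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5"]}
# {qid: {docid: label}}
qrels = {"q1": {"d1": 2, "d2": 0}, "q2": {"d4": 1}}

relevant, non_relevant = split_by_relevance(qid_to_docids, qrels, relevance_level=1)
print(relevant)      # {'q1': ['d1'], 'q2': ['d4']}
print(non_relevant)  # {'q1': ['d2', 'd3'], 'q2': ['d5']}
```

Raising relevance_level to 2 in this example would move d4 (label 1) into the non-relevant group for q2, leaving q2 with no relevant documents.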
- class capreolus.sampler.TrainTripletSampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: Sampler, TrainingSamplerMixin, torch.utils.data.IterableDataset
Samples training data triplets. Each sample is of the form (query, relevant doc, non-relevant doc).
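As a rough illustration of the (query, relevant doc, non-relevant doc) sample shape, here is a minimal generator written in plain Python. It is a sketch under stated assumptions rather than the TrainTripletSampler implementation: iter_triplets is a hypothetical function, it yields ids instead of extracted features, and uniform random choice of query and documents is assumed.

```python
import itertools
import random


def iter_triplets(relevant, non_relevant, seed=0):
    """Hypothetical generator yielding (qid, relevant docid, non-relevant docid).

    Only queries with at least one relevant and one non-relevant document
    can produce a triplet, so all other queries are skipped.
    """
    rng = random.Random(seed)
    usable = sorted(q for q in relevant if relevant[q] and non_relevant.get(q))
    if not usable:
        return  # nothing to sample from
    while True:  # endless stream; the consumer decides when to stop
        qid = rng.choice(usable)
        yield qid, rng.choice(relevant[qid]), rng.choice(non_relevant[qid])


relevant = {"q1": ["d1"], "q2": ["d4"], "q3": []}
non_relevant = {"q1": ["d2", "d3"], "q2": ["d5"], "q3": ["d6"]}

# Take a handful of triplets from the endless stream.
triplets = list(itertools.islice(iter_triplets(relevant, non_relevant), 5))
for qid, posdoc, negdoc in triplets:
    print(qid, posdoc, negdoc)
```

An endless generator fits the iterable-dataset style suggested by the torch.utils.data.IterableDataset base class, where the training loop rather than the dataset decides how many samples make up an iteration.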
- class capreolus.sampler.TrainPairSampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: Sampler, TrainingSamplerMixin, torch.utils.data.IterableDataset
Samples training data pairs. Each sample is of the form (query, doc). The number of generated positive samples equals the number of generated negative samples.
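One simple way to keep positive and negative pairs balanced is to emit them alternately. The sketch below does that in plain Python. It is an assumption about how such balance could be achieved, not a description of how TrainPairSampler does it. The function iter_balanced_pairs is hypothetical, and the 1/0 label attached to each pair is added only so the balance can be checked.

```python
import itertools
import random


def iter_balanced_pairs(relevant, non_relevant, seed=0):
    """Hypothetical generator yielding (qid, docid, label) with label 1 or 0.

    Each step picks one query and emits one positive pair followed by one
    negative pair, so any even-length prefix of the stream is balanced.
    """
    rng = random.Random(seed)
    usable = sorted(q for q in relevant if relevant[q] and non_relevant.get(q))
    if not usable:
        return  # nothing to sample from
    while True:
        qid = rng.choice(usable)
        yield qid, rng.choice(relevant[qid]), 1      # positive pair
        yield qid, rng.choice(non_relevant[qid]), 0  # negative pair


relevant = {"q1": ["d1"], "q2": ["d4"]}
non_relevant = {"q1": ["d2", "d3"], "q2": ["d5"]}

pairs = list(itertools.islice(iter_balanced_pairs(relevant, non_relevant), 10))
n_positive = sum(label for _, _, label in pairs)
print(n_positive, len(pairs) - n_positive)  # 5 5
```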
- class capreolus.sampler.LCETrainSampler(config=None, provide=None, share_dependency_objects=False, build=True)
Bases: TrainTripletSampler
Samples training data triplets. Each sample is of the form (query, relevant doc, non-relevant doc).